Description

The present invention relates to the field of molecular biology. In particular, it relates to, among other things,
nucleotide sequences of Staphylococcus aureus, contigs, ORFs, fragments, probes, primers and related polynucleotides
thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof,
such as in fermentation, polypeptide production, assays and pharmaceutical development, among others.

The genus Staphylococcus includes at least 20 distinct species. (For a review see Novick, R. P., The Staphylococcus
as a Molecular Genetic System, Chapter 1, pgs. 1-37 in MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI,
R. Novick, Ed., VCH Publishers, New York (1990)). Species differ from one another by 80% or more, by hybridization
kinetics, whereas strains within a species are at least 90% identical by the same measure.

The species Staphylococcus aureus, a gram-positive, facultatively aerobic, clump-forming coccus, is among the
most important etiological agents of bacterial infection in humans, as discussed briefly below.

Human Health and S. Aureus 


Staphylococcus aureus is a ubiquitous pathogen. (See, for instance, Mims et al., MEDICAL MICROBIOLOGY,
Mosby-Year Book Europe Limited, London, UK (1993)). It is an etiological agent of a variety of conditions, ranging in
severity from mild to fatal. A few of the more common conditions caused by S. aureus infection are burns, cellulitis,
eyelid infections, food poisoning, joint infections, neonatal conjunctivitis, osteomyelitis, skin infections, surgical wound
infection, scalded skin syndrome and toxic shock syndrome, some of which are described further below.

Burns

Burn wounds generally are sterile initially. However, they generally compromise physical and immune barriers to
infection, cause loss of fluid and electrolytes and result in local or general physiological dysfunction. After cooling,
contact with viable bacteria results in mixed colonization at the injury site. Infection may be restricted to the non-viable
debris on the burn surface ("eschar"), it may progress into full skin infection and invade viable tissue below the eschar
and it may reach below the skin, enter the lymphatic and blood circulation and develop into septicaemia. S. aureus is
among the most important pathogens typically found in burn wound infections. It can destroy granulation tissue and
produce severe septicaemia.

Cellulitis 

Cellulitis, an acute infection of the skin that expands from a typically superficial origin to spread below the cutaneous
layer, most commonly is caused by S. aureus in conjunction with S. pyogenes. Cellulitis can lead to systemic infection.
In fact, cellulitis can be one aspect of synergistic bacterial gangrene. This condition typically is caused by a mixture of
S. aureus and microaerophilic streptococci. It causes necrosis and treatment is limited to excision of the necrotic tissue.
The condition often is fatal.

Eyelid infections

S. aureus is the cause of styes and of "sticky eye" in neonates, among other eye infections. Typically such infections
are limited to the surface of the eye, but may occasionally penetrate the surface with more severe consequences.

Food poisoning

Some strains of S. aureus produce one or more of five serologically distinct, heat and acid stable enterotoxins that
are not destroyed by the digestive processes of the stomach and small intestine (enterotoxins A-E). Ingestion of the toxin,
in sufficient quantities, typically results in severe vomiting, but not diarrhoea. The effect does not require viable bacteria.
Although the toxins are known, their mechanism of action is not understood.

Joint infections 

S. aureus infects bone joints, causing diseases such as osteomyelitis.


Osteomyelitis 

S. aureus is the most common causative agent of haematogenous osteomyelitis. The disease tends to occur in
children and adolescents more than adults and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the long end of growing bone, hence its occurrence in physically immature populations. Most often, infection
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long, growing
bones.


Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections 
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections. 
Recurrent infections of the nasal passages plague nasal carriers of S. aureus.

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds; sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of surgical wounds can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing
systemic diseases, such as endocarditis and osteomyelitis.


Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrolysis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not
treated properly.


Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin. 
The disease can be caused by S. aureus infection at any site, but it is too often erroneously viewed exclusively as a
disease of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS") S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains 

Prior to the introduction of penicillin the prognosis for patients seriously infected with S. aureus was unfavorable.
Following the introduction of penicillin in the early 1940s even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of S.
aureus encountered in hospital infections today do not respond to penicillin; although, fortunately, this is not the case
for S. aureus encountered in community infections.

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to
penicilloic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated
episomally, typically on a plasmid, and often is only one of several genes on an episomal element that, together, confer
multidrug resistance.

Methicillins, introduced in the 1960s, largely overcame the problem of penicillin resistance in S. aureus. These
compounds conserve the portions of penicillin responsible for antibiotic activity and modify or alter other portions that
make penicillin a good substrate for inactivating lactamases. However, methicillin resistance has emerged in S. aureus,
along with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline,
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply
drug resistant.
The molecular genetics of most types of drug resistance in S. aureus has been elucidated (See Lyon et al.,
Microbiology Reviews 51: 88-134 (1987)). Generally, resistance is mediated by plasmids, as noted above regarding penicillin
resistance; however, several stable forms of drug resistance have been observed that apparently involve integration
of a resistance element into the S. aureus genome itself.

Thus far each new antibiotic has given rise to resistant strains, strains have emerged that are resistant to multiple
drugs, and increasingly persistent forms of resistance have begun to emerge. Drug resistance of S. aureus infections already
poses significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed.

Molecular Genetics of Staphylococcus Aureus 



Despite its importance in, among other things, human disease, relatively little is known about the genome of this 
organism. 

Most genetic studies of S. aureus have been carried out using the strain NCTC 8325, which contains prophages
φ11, φ12 and φ13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages.

These studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular,
covalently closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as
prophages, plasmids, transposons and the like.

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low
resolution and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325. (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R.P. Novick, Ed., VCH Publishers, New York (1990)). The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001, which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the
investigators. The size of the largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl. C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.)

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add dramatically
to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections.

There is a need therefore to characterize the genome of S. aureus and for polynucleotides and sequences of this
organism.

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary 
nucleotide sequences which were generated are provided in SEQ ID NOS: 1-5,191. 

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one
embodiment, the present invention is provided as contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS:1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

The nucleotide sequence of SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID
NOS:1-5,191 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/
optical storage media.

The present invention further provides systems, particularly computer-based systems which contain the sequence
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome.

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the
Staphylococcus aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as open reading frames or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as expression modulating fragments or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as diagnostic fragments or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genome of the present invention. The recombinant constructs of the present invention comprise
vectors, such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus
aureus genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell.

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and
proteins, encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an
individual against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention.
Specifically, by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention.
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an
immortalized cell line which is capable of secreting a specific monoclonal antibody.

The present invention further provides methods of identifying test samples derived from cells which express one
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the above-described assays. 

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents
capable of detecting presence of bound antibodies, antigens or hybridized DFs.

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates,
pharmaceutical agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded
by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Staphylococcus aureus will be of great value to all laboratories working with
this organism and for a variety of commercial purposes. Many fragments of the Staphylococcus aureus genome will
be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value
to Staphylococcus aureus researchers and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes
has greatly enhanced, and will continue to enhance, the ability to analyze and understand chromosomal organization. In
particular, sequenced contigs and genomes will provide the models for developing tools for the analysis of chromosome
structure and function, including the ability to identify genes within large segments of genomic DNA, the structure,
position, and spacing of regulatory elements, the identification of genes with potential industrial applications, and the
ability to do comparative genomics and molecular phylogeny.

FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems
of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit
and annotate the contigs of the Staphylococcus aureus genome of the present invention. Both Macintosh and Unix
platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al.,
Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer
Society Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence
removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature
data extracted from the sequence files by Factura to the Unix based Staphylococcus aureus relational database.
Assembly of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and
their associated features using extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting
sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides.
The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic
Research ("TIGR") for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs
generated by the assembly step is loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against S. aureus sequences
from Genbank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et
al., J. Mol. Biol. 215:403-410 (1990). Results of the ORF determination and similarity searching steps were loaded
into the database. As described below, some results of the determination and the searches are set out in Tables 1-3.
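
The ambiguity-trimming step attributed to seq_filter above can be illustrated with a minimal Python sketch. The source does not describe how seq_filter is implemented; the window-based end trimming, the window size, and the function names below are assumptions made only for illustration. The 2% threshold is the only value taken from the text.

```python
# Illustrative sketch only: NOT the seq_filter program itself.
# Assumption: "ambiguous" means any character other than A, C, G or T.
UNAMBIGUOUS = set("ACGT")


def ambiguous_fraction(seq: str) -> float:
    """Return the fraction of characters that are not A, C, G or T."""
    if not seq:
        return 0.0
    return sum(base not in UNAMBIGUOUS for base in seq.upper()) / len(seq)


def trim_ambiguous_ends(seq: str, window: int = 50, max_fraction: float = 0.02) -> str:
    """Drop whole windows from either end while that end window exceeds max_fraction.

    The window size is a hypothetical parameter; the 2% default mirrors the text.
    """
    start, end = 0, len(seq)
    # Trim from the 5' end, one window at a time.
    while end - start >= window and ambiguous_fraction(seq[start:start + window]) > max_fraction:
        start += window
    # Trim from the 3' end, one window at a time.
    while end - start >= window and ambiguous_fraction(seq[end - window:end]) > max_fraction:
        end -= window
    return seq[start:end]


if __name__ == "__main__":
    read = "N" * 10 + "ACGTACGTAC" * 3 + "ACNNACGTAC"
    print(trim_ambiguous_ends(read, window=10))
```

In this sketch a read with ambiguous calls clustered at its ends keeps only its clean interior; a production filter would likely also consider base-call quality, which is outside what the text states.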

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome and analysis
of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID
NOS:1-5,191. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC
nomenclature system.)

In addition to the aforementioned Staphylococcus aureus polynucleotide and polynucleotide sequences, the
present invention provides the nucleotide sequences of SEQ ID NOS:1-5,191, or representative fragments thereof, in
a form which can be readily used, analyzed, and interpreted by a skilled artisan.

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-5,191" refers
to any portion of the SEQ ID NOS:1-5,191 which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are Staphylococcus aureus open reading frames ("ORFs"),
expression modulating fragments ("EMFs") and fragments which can be used to diagnose the presence of
Staphylococcus aureus in a sample ("DFs"). A non-limiting identification of preferred representative fragments is provided
in Tables 1-3.
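
To make the notion of an open reading frame concrete, the following Python sketch scans the three forward reading frames of a sequence for stretches that begin with ATG and end at the first in-frame stop codon. It is a simplified illustration, not the zorf program: the ATG-only start rule, the forward-strand-only scan, the minimum length, and the function name are all assumptions, and bacterial genes can also use alternative start codons.

```python
# Illustrative sketch only: a naive forward-strand ORF scan.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 30):
    """Return sorted (start, end) half-open coordinates of ATG...stop ORFs.

    The length counted against min_codons includes the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    contig = "CCATGAAATTTTAGCC"
    for begin, stop in find_orfs(contig, min_codons=3):
        print(begin, stop, contig[begin:stop])
```

A fuller treatment would also scan the reverse complement and apply a statistical gene model; those choices are not specified in the text and are left out here.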

As discussed in detail below, the information provided in SEQ ID NOS:1-5,191 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence
all "representative fragments" of interest, including open reading frames encoding a large variety of Staphylococcus
aureus proteins.

While the presently disclosed sequences of SEQ ID NOS:1-5,191 are highly accurate, sequencing techniques are
not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal
a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-5,191. However, once the
present invention is made available (i.e., once the information in SEQ ID NOS:1-5,191 and Tables 1-3 has been made
available), resolving a rare sequencing error in SEQ ID NOS:1-5,191 will be well within the skill of the art. The present
disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to
be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides
may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler
can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques potential
errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing
effort, also of a routine nature, to the region containing the potential error.

Even if all of the very rare sequencing errors in SEQ ID NOS:1-5,191 were corrected, the resulting nucleotide
sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would
be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine
application of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection ("ATCC").

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat.
However, the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-5,191. Nearly all will be at least 99%
identical and the great majority will be 99.9% identical.

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and
most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191, in a form which can be readily
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical
to the nucleotide sequences of SEQ ID NOS:1-5,191 are routine and readily available to the skilled artisan. For example,
the well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
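
The 95%, 99% and 99.9% thresholds can be illustrated with a small Python sketch that computes percent identity over two sequences that have already been aligned (for example, by FASTA or BLASTN). It does not perform the alignment itself and does not reproduce either program's scoring. The convention of counting gap columns in the denominator, and the function names, are assumptions; tools differ on this point.

```python
# Illustrative sketch only: percent identity over a pre-computed pairwise alignment.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same base in both sequences.

    Gaps are written as '-'. Columns that are gaps in both rows are ignored;
    columns with a gap in one row count as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(a == b and a != "-" for a, b in columns)
    return 100.0 * matches / len(columns)


def identity_tier(pct: float):
    """Return the highest threshold from the text (99.9, 99.0, 95.0) that pct meets, else None."""
    for threshold in (99.9, 99.0, 95.0):
        if pct >= threshold:
            return threshold
    return None


if __name__ == "__main__":
    reference = "ACGT" * 5
    variant = "ACGT" * 4 + "ACGA"  # one mismatch in 20 columns
    pct = percent_identity(reference, variant)
    print(pct, identity_tier(pct))
```

With one mismatch in twenty aligned columns the sketch reports 95% identity, which meets only the lowest of the three thresholds.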

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide
sequence of SEQ ID NOS:1-5,191 may be "provided" in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS:1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus
genome or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to create a manufacture
comprising computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will
generally be based on the means chosen to access the stored information. In addition, a variety of data processor
programs and formats can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file, formatted in
commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having
recorded thereon the nucleotide sequence information of the present invention.
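
As a hedged illustration of the two storage styles named above (an ASCII text file and a database), the Python sketch below formats sequence records as FASTA-style text and loads the same records into a relational table. SQLite from the Python standard library stands in for DB2, Sybase or Oracle purely for convenience; the table name, column names, record identifiers and function names are hypothetical and do not come from the source.

```python
# Illustrative sketch only: two ways to record the same sequence records.
import sqlite3


def to_fasta(records, width: int = 60) -> str:
    """Render (identifier, sequence) pairs as FASTA-style ASCII text."""
    lines = []
    for seq_id, seq in records:
        lines.append(f">{seq_id}")
        # Wrap each sequence at a fixed line width.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def store_in_database(records) -> sqlite3.Connection:
    """Load (identifier, sequence) pairs into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contig (seq_id TEXT PRIMARY KEY, sequence TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO contig VALUES (?, ?)", records)
    conn.commit()
    return conn


if __name__ == "__main__":
    # Made-up placeholder records, not sequences from the Sequence Listing.
    records = [("contig_1", "ATGAAATTTTAG"), ("contig_2", "GGCC")]
    print(to_fasta(records, width=5), end="")
    conn = store_in_database(records)
    row = conn.execute(
        "SELECT sequence FROM contig WHERE seq_id = ?", ("contig_2",)
    ).fetchone()
    print(row[0])
```

The flat file favors sequential reading by search programs, while the table favors retrieval of individual records by identifier, which reflects the point that the storage structure is chosen according to how the information will be accessed.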

Computer software is publicly available which allows a skilled artisan to access sequence information provided in
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID
NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and
most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-5,191 the present invention enables the
skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Staphylococcus aureus genome useful in producing commercially
important proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence 
information described herein. Such systems are designed to identify, among other things, commercially important
fragments of the Staphylococcus aureus genome.

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage 
means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means 
of the computer-based systems of the present invention comprises a central processing unit (CPU), input means,
output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having 
stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means 
for supporting and implementing a search means. 

As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the 
present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide
sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented on the computer-based
system to compare a target sequence or target structural motif with the sequence information stored within the data
storage means. Search means are used to identify fragments or regions of the present genomic sequences which
match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly and a variety
of commercially available software for conducting search means is available and can be used in the computer-based systems
of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and
BLASTX (NCBIA). A skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present computer-based systems.

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two
or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target
sequence will be present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in gene expression and
protein processing, may be of shorter length.
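
The relationship between target length and chance matches can be made concrete with a short calculation. The Python sketch below assumes, purely for illustration, a database of independent and equiprobable residues, so that a target of length k over an alphabet of size a is expected to occur about (N - k + 1) / a^k times by chance on one strand of a database of N residues. The database size used is a hypothetical round figure and is not a property of the disclosed sequences.

```python
def expected_random_matches(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches of one target on one strand.

    Assumes independent, equiprobable residues, which is a simplification.
    Use alphabet_size=20 for amino acid targets.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions / alphabet_size ** target_length


# Hypothetical database size of 2.8 million nucleotides.
DATABASE_LENGTH = 2_800_000

for k in (6, 10, 15, 20, 30):
    print(k, expected_random_matches(k, DATABASE_LENGTH))
```

Under these assumptions a six-nucleotide target is expected to occur hundreds of times by chance in a database of this size, whereas a thirty-nucleotide target is not expected to occur by chance at all.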

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combi- 
nation of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed 
upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include,
but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in 
the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the 
Staphylococcus aureus genomic sequences possessing varying degrees of homology to the target sequence or target 
motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the 
target sequence or target motif and identifies the degree of homology contained in the identified fragment.
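
As an illustration of a search means coupled to a ranked output, the following Python sketch slides a target sequence along each stored fragment, records the best ungapped percent identity found, and lists the fragments from highest to lowest score. It is a deliberately naive comparison, not an implementation of BLAST, BLAZE or any other program named herein, and the fragment names and sequences are hypothetical.

```python
def best_window_identity(target, fragment):
    """Best ungapped percent identity of target against any window of fragment.

    Returns (percent_identity, zero_based_start); (0.0, -1) if no window fits.
    """
    k = len(target)
    if k == 0 or len(fragment) < k:
        return 0.0, -1
    best_score, best_start = -1.0, -1
    for start in range(len(fragment) - k + 1):
        window = fragment[start:start + k]
        matches = sum(1 for a, b in zip(target, window) if a == b)
        score = 100.0 * matches / k
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start


def rank_fragments(target, fragments):
    """Return (name, percent_identity, start) tuples, best match first."""
    hits = []
    for name, sequence in fragments.items():
        score, start = best_window_identity(target, sequence)
        hits.append((name, score, start))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical stored fragments and target sequence.
FRAGMENTS = {
    "fragment_A": "GGCTATAATGCGTTACC",
    "fragment_B": "GGCTATTATGCGTTACC",
    "fragment_C": "CCCCCCCCCCCCCCCCC",
}
TARGET = "TATAATGC"

for name, identity, start in rank_fragments(TARGET, FRAGMENTS):
    print(f"{name}\t{identity:.1f}%\tposition {start + 1}")
```

A practical search means would additionally allow gaps and assign statistical significance to each hit; the sketch shows only the ranking of fragments by degree of identity.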

A variety of comparing means can be used to compare a target sequence or target motif with the data storage 
means to identify sequence fragments of the Staphylococcus aureus genome. In the present examples, implementing
software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410
(1990), was used to identify open reading frames within the Staphylococcus aureus genome. A skilled artisan can
readily recognize that any one of the publicly available homology search programs can be used as the search means
for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known 
to those of skill also may be employed in this regard. 

Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present
invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage
devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage
device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable
storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data
recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable medium storage device 114, once
it is inserted into the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108, 
any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for
accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system
and the software program or programs.

BIOCHEMICAL EMBODIMENTS

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome, 
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include, 
but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs) and
fragments which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic
fragments (DFs). 

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome" 
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification 
means to reduce, from the composition, the number of compounds which are normally associated with the composition.
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-5,191, to
representative fragments thereof as described above, to polynucleotides at least 95%, preferably at least 99% and 
especially preferably at least 99.9% identical in sequence thereto, also as set out above. 

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated
in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS:1-5,191, the information in Tables
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-5,191
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or
other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to, single stranded and
double stranded DNA, and single stranded RNA.
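
The generation of primers flanking an ORF from known contig sequence, as described above for PCR cloning, can be sketched as follows. The Python fragment simply takes the first bases of the region as the forward primer and the reverse complement of its last bases as the reverse primer. The function name, the optional flank parameter and the short contig are hypothetical, and considerations that govern practical primer design (melting temperature, self-complementarity, specificity) are deliberately omitted.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def flanking_primers(contig, orf_start, orf_length, primer_length=20, flank=0):
    """Return (forward, reverse) primers, each written 5' to 3'.

    orf_start is the 1-based position of the first ORF nucleotide in the
    contig; flank extends the amplified region on both sides of the ORF.
    """
    begin = orf_start - 1 - flank
    end = orf_start - 1 + orf_length + flank
    if begin < 0 or end > len(contig) or end - begin < primer_length:
        raise ValueError("region does not fit within the contig")
    forward = contig[begin:begin + primer_length].upper()
    reverse = reverse_complement(contig[end - primer_length:end])
    return forward, reverse


# Hypothetical contig containing a six-codon ORF starting at position 4.
CONTIG = "GGGATGAAACCCGGGTTTTAACCC"
print(flanking_primers(CONTIG, orf_start=4, orf_length=18, primer_length=6))
```

The start position and length passed to the function correspond to the kind of coordinates listed for each ORF in Tables 1-3.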

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any
termination codons and is a sequence translatable into protein. 
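
The definition above can be sketched computationally. The following Python fragment scans the three forward reading frames of a sequence for runs of triplets uninterrupted by a termination codon. It assumes the standard termination codons TAA, TAG and TGA, ignores the opposite strand and any start-codon requirement for brevity, and uses a hypothetical minimum length. It is not the procedure by which the ORFs listed in the tables herein were identified.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard termination codons (assumed)


def open_reading_frames(sequence, min_codons=3):
    """Yield (frame, start, length_nt) for stop-free runs of triplets.

    frame is 1, 2 or 3, with the first 5' nucleotide opening frame 1;
    start is the 1-based position of the first nucleotide of the run.
    """
    sequence = sequence.upper()
    for offset in range(3):
        # Index just past the last complete triplet in this frame.
        end = offset + 3 * ((len(sequence) - offset) // 3)
        run_start = offset
        for pos in range(offset, end, 3):
            if sequence[pos:pos + 3] in STOP_CODONS:
                if (pos - run_start) // 3 >= min_codons:
                    yield offset + 1, run_start + 1, pos - run_start
                run_start = pos + 3
        # A run may also extend to the end of the sequence.
        if (end - run_start) // 3 >= min_codons:
            yield offset + 1, run_start + 1, end - run_start


# Hypothetical 18-nucleotide example.
for orf in open_reading_frames("ATGAAACCCGGGTAAATG"):
    print(orf)
```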

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov
probability transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists. 

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 80 amino 
acids long and over a continuous region of at least 50 bases which are 95% or more identical (by BLAST analysis) to 
an S. aureus nucleotide sequence available through Genbank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and 
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through Genbank by Sep- 
tember 1996. 

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly,
by BLASTP analysis, a polypeptide sequence available through Genbank by September 1996.
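
The sorting of ORFs among Tables 1, 2 and 3 can be summarised as a simple decision rule. The Python sketch below restates the criteria given above. The record fields (ORF length in amino acids, length and percent identity of the best continuous nucleotide match to a known S. aureus sequence, and best BLASTP probability) are hypothetical names chosen for illustration, and treating every ORF that meets neither of the first two criteria as a Table 3 entry is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrfSearchResult:
    """Hypothetical summary of database-search results for one ORF."""
    length_aa: int
    saureus_match_length_nt: int = 0      # longest continuous matching region
    saureus_match_identity: float = 0.0   # percent identity over that region
    best_blastp_probability: Optional[float] = None  # None if no protein hit


def assign_table(orf: OrfSearchResult) -> int:
    """Return 1, 2 or 3 according to the criteria stated in the text."""
    # Table 1: at least 80 amino acids, and 95% or more identical to a known
    # S. aureus nucleotide sequence over at least 50 continuous bases.
    if (orf.length_aa >= 80
            and orf.saureus_match_length_nt >= 50
            and orf.saureus_match_identity >= 95.0):
        return 1
    # Table 2: not in Table 1, BLASTP probability score of 0.01 or less.
    if (orf.best_blastp_probability is not None
            and orf.best_blastp_probability <= 0.01):
        return 2
    # Table 3: no significant BLASTP match.
    return 3
```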

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number 
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of 
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand; 
and the fifth column indicates the length of each ORF in nucleotides.

In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through Genbank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominators. Descriptions of the nomenclature are available from the National Center for
Biotechnology Information. Column seven in Tables 1 and 2 provides the "gene name" of the matching sequence; column eight
provides the "BLAST identity" score from the comparison of the ORF and the homologous gene; and column nine
indicates the length in nucleotides of the "highest scoring segment pair" identified by the BLAST identity analysis.

In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues.

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions
1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have
a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar"
(i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence
similarity, such as FASTA and BLAST, specifically list percent identity of a matching region as an output parameter. Thus,
for instance, Tables 1 and 2 herein enumerate the "percent identity" of the "highest scoring segment pair" in each ORF
and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations provided below.
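
The worked example above (70% identity, 80% similarity) can be reproduced with a short Python sketch. The two ten-residue peptides are hypothetical and differ at positions 1, 3 and 5, and the grouping of amino acids into "similar" classes is an illustrative assumption; programs such as FASTA and BLAST rely on substitution matrices rather than on a fixed grouping of this kind.

```python
# Hypothetical grouping of amino acids with broadly similar side chains.
SIMILARITY_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("STNQ"), set("AG"),
]


def are_similar(a, b):
    """True if two residues are identical or fall in the same group."""
    return a == b or any(a in group and b in group for group in SIMILARITY_GROUPS)


def _check(seq1, seq2):
    if not seq1 or len(seq1) != len(seq2):
        raise ValueError("sequences must be non-empty and of equal length")


def percent_identity(seq1, seq2):
    """Percentage of aligned positions carrying identical residues."""
    _check(seq1, seq2)
    return 100.0 * sum(a == b for a, b in zip(seq1, seq2)) / len(seq1)


def percent_similarity(seq1, seq2):
    """Percentage of aligned positions carrying identical or similar residues."""
    _check(seq1, seq2)
    return 100.0 * sum(are_similar(a, b) for a, b in zip(seq1, seq2)) / len(seq1)


# Hypothetical peptides differing at positions 1, 3 and 5; only the
# substitution at position 5 (A to G) is between "similar" residues.
PEPTIDE_1 = "MKTLAEGSFW"
PEPTIDE_2 = "PKWLGEGSFW"

print(percent_identity(PEPTIDE_1, PEPTIDE_2))    # 70.0
print(percent_similarity(PEPTIDE_1, PEPTIDE_2))  # 80.0
```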

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the 
types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Staphylococcus aureus genome other than those listed in Tables
1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF in addition to those 
ascertainable using the computer-based systems of the present invention. 

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which
modulates the expression of an operably linked ORF or EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the
expression of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters, and
promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression
of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Staphylococcus aureus genome by their proximity to
the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to
200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably
linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic
segment" refers to fragments of the Staphylococcus aureus genome which are between two ORF(s) herein described.
EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems
of the present invention. Further, the two methods can be combined and used together.
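
The identification of candidate EMF-containing intergenic segments from ORF coordinates can be sketched as follows. The Python fragment takes ORFs described by a start position and a length in nucleotides, in the manner of the table columns described above, and returns the gaps between consecutive ORFs whose length falls within the range of about 10 to 200 nucleotides. Treating all ORFs as lying on one strand, and discarding longer gaps rather than taking fragments of them, are simplifying assumptions, and the coordinates shown are hypothetical.

```python
def intergenic_segments(orfs, min_length=10, max_length=200):
    """Return (start, end) pairs, 1-based inclusive, between consecutive ORFs.

    orfs is an iterable of (start, length_nt) tuples on a single strand.
    Gaps outside [min_length, max_length], and overlaps, are skipped.
    """
    ordered = sorted(orfs)
    segments = []
    for (start_a, length_a), (start_b, _) in zip(ordered, ordered[1:]):
        gap_start = start_a + length_a  # first nucleotide after the upstream ORF
        gap_end = start_b - 1           # last nucleotide before the downstream ORF
        gap_length = gap_end - gap_start + 1
        if min_length <= gap_length <= max_length:
            segments.append((gap_start, gap_end))
    return segments


# Hypothetical ORF coordinates on one contig: (start, length in nucleotides).
ORFS = [(1, 300), (351, 600), (955, 120), (1500, 90)]
print(intergenic_segments(ORFS))
```

Segments returned in this way are candidates only; as stated above, they can be cross-checked by searching with known EMFs as target sequences or target motifs.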

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a
cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic
resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap
vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the
expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided
below.

A sequence which is suspected as being an EMF is cloned in all three reading frames in one or more restriction
sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate
host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize 
to Staphylococcus aureus sequences. DFs can be readily identified by identifying unique sequences within contigs of 
the Staphylococcus aureus genome, such as by using well-known computer analysis software, and by generating and 
testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which
determines amplification or hybridization selectivity.

The sequences falling within the scope of the present invention are not limited to the specific sequences herein 
described, but also include allelic and species variations thereof. Allelic and species variations can be routinely
determined by comparing the sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably 99% and most preferably 99.9% identical to SEQ ID NOS:1-5,191, with a sequence
from another isolate of the same species.

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same
amino acid sequences as do the nucleic acid sequences mentioned above. In other words, in the coding region of an 
ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated. 
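
Whether two coding sequences differ only by such codon substitutions can be checked by translating both and comparing the resulting amino acid sequences. The Python sketch below assumes the standard genetic code and uses hypothetical twelve-nucleotide coding sequences; organisms or organelles that use a different code would require a different table.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence):
    """Translate a DNA coding sequence read in frame from its first base."""
    coding_sequence = coding_sequence.upper()
    if len(coding_sequence) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(
        CODON_TABLE[coding_sequence[i:i + 3]]
        for i in range(0, len(coding_sequence), 3)
    )


def encode_same_polypeptide(seq1, seq2):
    """True if the two coding sequences translate to the same residues."""
    return translate(seq1) == translate(seq2)


# Hypothetical sequences: the second substitutes synonymous codons
# (GCT to GCC, AAA to AAG, TAA to TGA); the third changes an amino acid.
ORIGINAL = "ATGGCTAAATAA"
SYNONYMOUS = "ATGGCCAAGTGA"
ALTERED = "ATGGATAAATAA"

print(encode_same_polypeptide(ORIGINAL, SYNONYMOUS))  # True
print(encode_same_polypeptide(ORIGINAL, ALTERED))     # False
```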

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, 
such as an ORF, in both directions (i.e., sequence both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Staphylococcus aureus origin isolated by using part or all of the fragments 
in question as a probe or primer. 

Each of the ORFs of the Staphylococcus aureus genome disclosed in Tables 1, 2 and 3, and the EMFs found 5'
to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used
as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample,
particularly Staphylococcus aureus. Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most likely to be highly selective for
Staphylococcus aureus. Also particularly preferred are ORFs that can be used to distinguish between strains of
Staphylococcus aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a
polynucleotide sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides.
Polynucleotides suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979); Cooney et al., Science 241: 456 (1988); and Dervan et al., Science 251: 1360 (1991). Antisense
techniques in general are discussed in, for instance, Okano, J. Neurochem. 56: 560 (1991) and
OLIGODEOXYNUCLEOTIDES AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1988).
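
The design of an antisense oligonucleotide complementary to a chosen region of an mRNA can be sketched in a few lines of Python. The fragment below returns, written 5' to 3', the DNA oligonucleotide complementary to a window of 20 to 40 bases of a given mRNA. The function name and the thirty-base mRNA are hypothetical, and the selection of an effective target region, which in practice depends on accessibility and specificity, is not addressed.

```python
# Complement map producing a DNA oligonucleotide from DNA or RNA input.
_TO_DNA_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def antisense_oligo(mrna, start, length=20):
    """Return the antisense DNA oligonucleotide, written 5' to 3'.

    start is the 0-based index of the first targeted base of the mRNA;
    length must lie in the 20 to 40 base range mentioned in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be from 20 to 40 bases")
    target = mrna[start:start + length]
    if start < 0 or len(target) < length:
        raise ValueError("target window extends beyond the mRNA")
    return target.translate(_TO_DNA_COMPLEMENT)[::-1]


# Hypothetical mRNA; the oligonucleotide targets its first 20 bases.
MRNA = "AUGGCUAAAGGUCCAUUAGCGGAUCCGUAA"
print(antisense_oligo(MRNA, 0, 20))
```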

The present invention further provides recombinant constructs comprising one or more fragments of the
Staphylococcus aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the
ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a
promoter, operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following vectors are provided by 
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a,
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia).

Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art.

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower
eukaryotic host cell, such as a yeast cell, or a prokaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present 
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such 
as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present 
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence.

Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode
proteins.

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially
available peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.
Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies
against the native polypeptide, as discussed further below.

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the
polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain,
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and
immuno-affinity chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to 
express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide 
or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally 
does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells
in order to generate a cell which produces one of the polypeptides or proteins of the present invention. 

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include,
but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic
hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial 
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or 
fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein
essentially free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the
polypeptides and proteins provided by this invention are assembled from fragments of the Staphylococcus aureus
genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a
microbial or viral operon.

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a
polypeptide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly
of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate
and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, enhancers and a polyadenylation signal; (2) a structural or coding sequence
which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the
beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast
or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated
protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence, 
it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the 
expressed recombinant protein to provide a final product.

"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional
unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be
prokaryotic or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or
proteins upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of
appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd
Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby
incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting
transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK),
alpha-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled
in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable
of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the
heterologous sequence can encode a fusion protein including an N-terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a 
desired protein together with suitable translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication
to ensure maintenance of the vector and, when desirable, provide amplification within the host. 

Suitable prokaryotic hosts for transformation include strains of Staphylococcus aureus, E. coli, B. subtilis,
Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus. Others
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed.

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product.
Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally
by physical or chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can
be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as
immunodiagnostic antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such
immunologically useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several
immunologically useful polypeptides:

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be
immunologically useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing type I
signal sequences contain the following physical attributes: The length of the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.
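
The type I attributes listed above can be turned into a rough screening function. The following is only a minimal
sketch, not the inventors' algorithm: the residue sets, the five-residue window used for the "extreme amino terminus",
the 50% cutoff for "primarily hydrophobic", and the function name are all assumptions made for illustration, and the
alpha-helical requirement is not tested at all.

```python
# Heuristic check of a candidate type I signal sequence (illustrative only).
HYDROPHOBIC = set("AVLIMFWC")  # assumed definition of "hydrophobic"
SMALL = set("AGS")             # assumed definition of "small side-chain"
POSITIVE = set("KR")
NEGATIVE = set("DE")


def looks_like_type_i_signal(protein, cleavage_index, n_region=5):
    """Return True if protein[:cleavage_index] fits the listed attributes.

    cleavage_index is the 0-based index of the first mature residue (+1),
    so the signal sequence is protein[:cleavage_index].
    """
    signal = protein[:cleavage_index].upper()
    # Approximately 15 to 25 residues long.
    if not 15 <= len(signal) <= 25:
        return False
    # Net positive charge in the extreme amino terminus.
    n_term = signal[:n_region]
    net_charge = (sum(aa in POSITIVE for aa in n_term)
                  - sum(aa in NEGATIVE for aa in n_term))
    if net_charge <= 0:
        return False
    # Primarily hydrophobic residues overall.
    fraction = sum(aa in HYDROPHOBIC for aa in signal) / len(signal)
    if fraction <= 0.5:
        return False
    # Small side chains at positions -1 and -3 relative to the cleavage site.
    return signal[-1] in SMALL and signal[-3] in SMALL
```

A call such as looks_like_type_i_signal(sequence, 20) asks whether the first 20 residues of a hypothetical sequence
behave like a type I signal sequence under these assumed thresholds.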

Also known in the art is the type IV signal sequence which is an example of the several types of functional signal
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S.,
J. Bacteriol. 174, 7345-7351; 1992). A type IV signal sequence typically consists of six to eight amino acids with a net
basic charge followed by an additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV
signal sequence is typically after the initial six to eight amino acids at the extreme amino terminus. In addition, all
type IV signal sequences contain a phenylalanine residue at the +1 site relative to the cleavage site.
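
A comparable sketch for the type IV attributes is given below. It is an illustration under stated assumptions rather
than the inventors' method: the hydrophobic residue set, the choice to score the first sixteen residues starting at
the +1 phenylalanine, the 50% cutoff, and the function name are hypothetical.

```python
# Heuristic search for a type IV leader peptide (illustrative only).
HYDROPHOBIC = set("AVLIMFWC")  # assumed definition of "hydrophobic"
POSITIVE = set("KR")
NEGATIVE = set("DE")


def type_iv_leader_length(protein, window=16):
    """Return the leader length (6, 7 or 8) if one fits, otherwise None."""
    protein = protein.upper()
    for n in range(6, 9):  # leader of six to eight amino acids
        segment = protein[n:n + window]
        # Phenylalanine is required at +1, the first residue after cleavage.
        if len(segment) < window or segment[0] != "F":
            continue
        leader = protein[:n]
        net_charge = (sum(aa in POSITIVE for aa in leader)
                      - sum(aa in NEGATIVE for aa in leader))
        if net_charge <= 0:  # leader must carry a net basic charge
            continue
        # The following stretch must be primarily hydrophobic.
        if sum(aa in HYDROPHOBIC for aa in segment) / window > 0.5:
            return n
    return None
```

The function reports only the first leader length that satisfies all three checks; a sequence lacking phenylalanine
at every candidate +1 position yields None.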

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C. Lipoproteins in bacteria. J. Bioenerg. Biomembr. 22, 451-471; 1990).
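
The lipoprotein consensus L-(A,S)-(G,A)-C maps directly onto a regular expression. The snippet below is a minimal
sketch; the function name is hypothetical, and a match indicates only that the consensus is present, not that the
protein is actually processed as a lipoprotein.

```python
import re

# L at -3, A or S at -2, G or A at -1, C at +1 (first residue after cleavage).
LIPOBOX = re.compile(r"L[AS][GA]C")


def lipobox_cleavage_sites(protein):
    """Return 0-based indices of the +1 cysteine for every consensus match."""
    return [m.start() + 3 for m in LIPOBOX.finditer(protein.upper())]
```

Each returned index marks the cysteine at +1, so protein[index:] is the predicted mature sequence for that candidate
site.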

It is well known that most anchored proteins found on the surface of gram-positive bacteria possess a highly
conserved carboxy terminal sequence. More than fifty such proteins from organisms such as S. pyogenes, S. mutans,
E. faecalis, S. pneumoniae, and others, have been identified based on their extracellular location and carboxy terminal
amino acid sequence (Fischetti, V. A. Gram-positive commensal bacteria deliver antigens to elicit mucosal and systemic
immunity. ASM News 62, 405-410; 1996). The conserved region is comprised of six charged amino acids at the extreme
carboxy terminus coupled to 15-20 hydrophobic amino acids presumed to function as a transmembrane domain.
Immediately adjacent to the transmembrane domain is a six amino acid sequence conserved in nearly all proteins
examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.
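
The L-P-X-T-G-X sequence can likewise be located with a simple pattern search. This sketch, with a hypothetical
function name, reports only where the six-residue sequence occurs; it does not verify the adjacent hydrophobic
transmembrane domain or the charged carboxy terminal tail described above.

```python
import re

# L-P-X-T-G-X, where X (written "." in the pattern) is any amino acid.
LPXTGX = re.compile(r"LP.TG.")


def find_anchor_motifs(protein):
    """Return (0-based start index, matched residues) for each occurrence."""
    return [(m.start(), m.group()) for m in LPXTGX.finditer(protein.upper())]
```

Because anchored proteins carry the sequence near the carboxy terminus, a caller would normally keep only matches
that fall close to the end of the protein; how close is left to the user.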

Amino acid sequence similarities to proteins of known function, identified by BLAST, enable the assignment of putative
functions to novel amino acid sequences and allow for the selection of proteins thought to function outside the cell
wall. Such proteins are well known in the art and include those designated "lipoprotein", "periplasmic", or "antigen".

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be
outermembrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing as SEQ
ID NOS:5,192 to 5,255. Thus the amino acid sequence of each of several antigenic Staphylococcus aureus polypeptides
listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the Sequence
Listing. Likewise the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the sequence listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods.
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS:5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general,
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include, for
example, the mature forms of the polypeptides expected to exist in nature.

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4, and in the
case of truncated polypeptide sequences shown as SEQ ID NOS:5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequences of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the sequence listing.

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide selected
from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least 95%
identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto, form an
embodiment of the invention; in addition, polypeptides comprising an amino acid sequence selected from the group of
amino acid sequences shown in the sequence listing as SEQ ID NOS:5,191-5,255, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto and most preferably at least 99% identical thereto, form
an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of the present
invention.

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen.
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope."
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins", Science 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance,
Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least
seven, more preferably at least nine and most preferably between about 15 to about 30 amino acids contained within
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides
that can be used to generate S. aureus specific antibodies include: a polypeptide comprising peptides shown in Table
4 below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3.

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids. Proc. Natl. Acad. Sci. USA 82:
5131-5135; this "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance,
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding
site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such
peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a
peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods.

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus
aureus outermembrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y, where x is the number
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of
amino acids 36 to 45 of SEQ ID NO:5,192, as is described in Table 4. The inventors have identified several epitopes
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention
further provides polynucleotides encoding such polypeptides.
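
Because the x-y coordinates count amino acids from 1 and include both endpoints, extracting an antigenic region
from a sequence string needs a small index adjustment. The helper below is a minimal sketch with a hypothetical name;
the sequence passed to it would be the relevant amino acid sequence from the Sequence Listing, which is not
reproduced here.

```python
def antigenic_region(sequence, region):
    """Return the residues for a Table 4 style range such as "36-45".

    The range is 1-based and inclusive: x is the first amino acid of the
    epitope and y is the last.
    """
    first_text, last_text = region.split("-")
    first, last = int(first_text), int(last_text)
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("region %s lies outside the sequence" % region)
    # Convert to Python's 0-based, end-exclusive slice.
    return sequence[first - 1:last]
```

For the range 36-45 the function returns ten residues, matching the inclusive convention used for the antigenic
regions.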

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are
substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid
and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity
between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological
activity, and equivalent expression characteristics are considered substantially equivalent. For purposes of determining
equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other strains of Staphylococcus aureus, of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of a fragment of the Staphylococcus aureus fragments or contigs or a protein encoded by one of the ORFs
of the present invention, if it shares significant homology to one of the fragments of the Staphylococcus aureus genome
of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among
especially preferred homologs, those with 95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97%, and even more particularly preferred among those are homologs with 99% or more
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood
that, among measures of homology, identity is particularly preferred in this regard.
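
The thresholds above can be applied to a percent identity figure as sketched below. This is an illustration only: it
assumes the two regions have already been aligned to equal length by some other tool, it counts identity over every
alignment column, it reads "97%" as "97% or more", and the function names and tier labels are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be pre-aligned to the same non-zero length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# (threshold, inclusive?, label), checked from the strictest tier downwards.
TIERS = [
    (99.9, True, "most preferred (99.9% or more)"),
    (99.0, True, "even more particularly preferred (99% or more)"),
    (97.0, True, "very particularly preferred (97%)"),
    (95.0, True, "particularly preferred (95% or more)"),
    (93.0, True, "especially preferred (93% or more)"),
    (90.0, False, "preferred (more than 90%)"),
    (85.0, False, "significant homology (greater than 85%)"),
]


def homology_tier(percent):
    """Return the strictest tier label met by percent, or None below all tiers."""
    for threshold, inclusive, label in TIERS:
        if percent > threshold or (inclusive and percent == threshold):
            return label
    return None
```

A pair of regions at exactly 85% identity returns None, because the definition of significant homology requires
greater than 85%.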

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
been described in great detail in many publications such as, for example, Innis et al., PCR PROTOCOLS, Academic
Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS:1-5,191 or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, one skilled in the art will recognize that by employing high stringency
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-5,191, or from a nucleotide sequence having an
aforementioned identity to a sequence of SEQ ID NOS:1-5,191, for colony/plaque hybridization, one skilled in the art will
recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 

Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide.
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which
the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial
use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed.,
Macmillan Publications, Ltd. NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar
aspects of the present invention are discussed below.

1. Biosynthetic Enzymes

Open reading frames encoding proteins that mediate the catalytic reactions of intermediary and macromolecular
metabolism, the biosynthesis of small molecules, cellular processes and other functions can be used for industrial
biosynthesis. Such proteins include enzymes involved in the degradation of the intermediary products of metabolism,
enzymes involved in central intermediary metabolism, enzymes involved in respiration, both aerobic and anaerobic,
enzymes involved in fermentation, enzymes involved in ATP proton motive force conversion, enzymes involved in broad
regulatory function, enzymes involved in amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes
involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as
non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification
and depectinization of fruit juices, in the extraction of vegetable oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21:79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY,
Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes
involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using the Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic acids,
which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as
described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI, Benett et al., Eds.,
Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for
quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and
cellulose hydrolysates. This application is described in Owusu et al., Biochem. et Biophysics. Acta. 872:83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977);
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983); and Hepner et al., Report Industrial
Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for
instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:227 (1985) and Poserke,
Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of the
washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral
intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those
scientists involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al.,
Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides,
and carbon-bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation
and organic synthesis it is sometimes necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. The advantage of using microbial based enzyme systems is that the amino transferase
enzymes catalyze the stereoselective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from members of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies

As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety
of procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and 
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a 
specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of
producing the desired antibody are well known in the art (Campbell, A. M., MONOCLONAL ANTIBODY TECHNOLOGY:
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)) and include the trioma technique and the human B-cell hybridoma technique (Kozbor et al.,
Immunology Today 4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER THERAPY,
Alan R. Liss, Inc. (1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the
peptide and the site of injection.

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase 
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but 
are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or
including an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such 
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells.

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces 
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175: 109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass is determined using procedures 
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U. S. Patent 4,946,778) can be adapted to
produce single chain antibodies to proteins of the present invention.

For polyclonal antibodies, antibody containing antisera is isolated from the immunized animal and is screened for 
the presence of antibodies with the desired specificity using one of the above-described procedures. 

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art (for example, see
Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval,
E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976)).

The labelled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells
or tissues in which a fragment of the Staphylococcus aureus genome is expressed.

The present invention further provides the above-described antibodies immobilized on a solid support. Examples 
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, 
acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34, Academic Press, N. Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for 
immunoaffinity purification of the proteins of the present invention. 

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present
invention, or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies
to components within the test sample.

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the 
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used 
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present
invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby
incorporated herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids 
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting
presence of a bound DF, antigen or antibody.

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such
containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test sample, a container which
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the
alternative, if the primary antibody is labelled, the enzymatic, or antibody binding reagents which are capable of reacting with
the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies of the
present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining 
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
fragments and the Staphylococcus aureus fragment and contigs herein described.

In general, such methods comprise steps of:

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated
fragment of the Staphylococcus aureus genome; and

(b) determining whether the agent binds to said protein or said fragment. 

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, 
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed 
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected 
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. 

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides
(for example, see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above,
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled 
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. 

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can
be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity. 

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary
to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney
et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano,
J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shutoff of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents.
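
As an illustration of how sequence information could feed such a design, the following minimal sketch derives an antisense DNA oligonucleotide complementary to a chosen region of an mRNA. The function name `design_antisense`, the example sequence, and the choice of a DNA oligonucleotide are assumptions made for this sketch only; the 20 to 40 base length limit is the one stated above. Practical designs would also weigh target accessibility, melting temperature and specificity, which are not modeled here.

```python
# Hypothetical helper for illustration; not a method described in this disclosure.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def design_antisense(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligonucleotide (written 5'->3') that is complementary
    to the mRNA region mrna[start:start + length]."""
    if not 20 <= length <= 40:
        raise ValueError("oligonucleotide length should be 20 to 40 bases")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Reverse the target, then complement each base (reverse complement).
    return "".join(COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    example_mrna = "AUGGCUAGCUUAGGCCAUCGAUCGGAUUACGCUAGC"  # made-up sequence
    print(design_antisense(example_mrna, start=0, length=20))
```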

5. Pharmaceutical Compositions and Vaccine 

The present invention further provides pharmaceutical agents which can be used to modulate the growth or
pathogenicity of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a
pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are
identified using the herein described assays.

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or
pathogenicity by binding to an important protein, thus blocking the biological activity of the protein, while other agents may
bind to a component of the outer surface of the organism, blocking attachment or rendering the organism more prone
to attack by the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs
of the present invention and serve as a vaccine. The development and use of vaccines derived from membrane
associated polypeptides are well known in the art. The inventors have identified particularly preferred immunogenic
Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described above and
summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity 
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As
such, related organisms do not need to be bacterial but may be fungal or viral pathogens.

The pharmaceutical agents and compositions of the present invention may be administered in a convenient
manner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal
routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of
administration, symptoms, etc.

The agents of the present invention can be used in native form or can be modified to form a chemical derivative.
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological
half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources,
REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the functional derivative, such as affinity 
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a 
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological
half-life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient 
by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single
or multiple injections.

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will 
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical 
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about 
1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered "in combination" with each other 
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent.

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to
decrease the rate of growth (as defined above) of the target organism.

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a
pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be
"pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is
physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or
gelatine-microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or 
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun 
approach. This procedure, described in detail in the examples that follow, eliminates the up-front cost of isolating
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols. 

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples 
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES
LIBRARIES AND SEQUENCING 

1. Shotgun Sequencing Probability Analysis 

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2: 231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size L, in nucleotides, is not sequenced after
a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where m is n/L, the fold coverage. For instance, for a genome of 2.8 Mb, m=1 when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence L has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at onefold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to 0.0067 or 0.67%.
5X coverage of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both
insert ends with an average sequence read length of 410 bp.
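
The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below is only a numerical illustration of the Poisson estimate; the function names are chosen here for convenience, and the default of two reads per clone reflects the statement that clones are sequenced from both insert ends.

```python
import math


def unsequenced_fraction(bases_sequenced: float, genome_size: float) -> float:
    """P0 = exp(-m), where m = bases_sequenced / genome_size is the fold coverage."""
    m = bases_sequenced / genome_size
    return math.exp(-m)


def clones_for_coverage(genome_size: float, coverage: float,
                        read_length: float, reads_per_clone: int = 2) -> int:
    """Number of clones needed to reach the requested fold coverage,
    assuming every read yields read_length usable bases."""
    bases_needed = genome_size * coverage
    return math.ceil(bases_needed / (read_length * reads_per_clone))


if __name__ == "__main__":
    genome = 2.8e6  # 2.8 Mb
    print(round(unsequenced_fraction(2.8e6, genome), 2))  # 1X coverage
    print(round(unsequenced_fraction(14e6, genome), 4))   # 5X coverage
    print(clones_for_coverage(genome, coverage=5, read_length=410))
```

With 410 bp reads from both ends, the clone count comes out slightly above 17,000, in line with the approximate figure given in the text.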

Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the average gap size, g, follows the
equation $g = L/n$, where n in this equation is the number of random sequence reads rather than a count of nucleotides.
Thus, 5X coverage leaves about 240 gaps averaging about 82 bp in size in a sequence of a polynucleotide 2.8 Mb long.
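
A short numerical sketch of the gap estimate follows. It assumes that n in the average gap size equation counts sequence reads (about 34,000 for roughly 17,000 clones read from both ends), which is the interpretation that reproduces the 82 bp figure; the function name and the returned tuple layout are illustrative only.

```python
import math


def gap_statistics(genome_size: float, n_reads: int, read_length: float):
    """Return (total gap length, average gap size, expected number of gaps).

    Assumes fold coverage m = n_reads * read_length / genome_size,
    total gap length G = L * exp(-m), and average gap size g = L / n_reads.
    """
    m = n_reads * read_length / genome_size
    total_gap = genome_size * math.exp(-m)
    average_gap = genome_size / n_reads
    return total_gap, average_gap, total_gap / average_gap


if __name__ == "__main__":
    total, average, count = gap_statistics(2.8e6, n_reads=34000, read_length=410)
    print(f"total gap ~{total:.0f} bp, average gap ~{average:.0f} bp, ~{count:.0f} gaps")
```

Under these assumptions the gap count lands in the low 230s, consistent with the rounded "about 240" quoted above.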

The treatment above is essentially that of Lander and Waterman, Genomics 2: 231 (1988).

2. Random Library Construction 

In order to approximate the random model described above during actual sequencing, a nearly ideal library of 
cloned genomic fragments is required. The following library construction procedure was developed to achieve this end. 

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min. at 0°C in a Branson
Model 450 Sonicator at the lowest energy setting using a 3 mm probe. The sonicated DNA was ethanol precipitated
and redissolved in 500 ul TE buffer.

To create blunt-ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in 100 ul TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel, and the
LGT agarose was melted and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single 
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated 
at 14°C for 4 hr. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA 
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v+i, 
v+2i, v+3i, etc. The portion of the gel containing v+i DNA was excised and the v+i DNA was recovered and resuspended 
into 20 ul TE. The v+i DNA then was blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture
(50 ul) containing the v+i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs), 
under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v+i linears were 
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v+i 
linears and 5 units of T4 ligase at 14°C overnight. After 10 min. at 70°C the following day, the reaction mixture was 
stored at -20°C.

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with 
minimal contamination from double-insert chimeras (<1%) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3 (1):5 (1990)) were used to prevent rearrangements,
deletions, and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 ul aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 ul aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 ul aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat
pulsed for 30 sec. at 42°C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation
mixture was plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 ul aliquot of transformation.
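
The stated working concentrations follow from a simple dilution calculation, sketched below. The helper name `final_concentration` is hypothetical, and the calculation assumes that volumes are additive.

```python
def final_concentration(stock_conc: float, stock_volume: float,
                        base_volume: float) -> float:
    """Concentration after adding stock_volume of a stock solution to
    base_volume of diluent, in the same units as stock_conc."""
    return stock_conc * stock_volume / (stock_volume + base_volume)


if __name__ == "__main__":
    # 1.7 ul of 1.42 M (1420 mM) beta-mercaptoethanol added to 100 ul of cells
    bme_mM = final_concentration(1420.0, 1.7, 100.0)
    # 0.4 ml of 50 mg/ml ampicillin added per 100 ml of SOB agar
    amp_ug_per_ml = final_concentration(50.0, 0.4, 100.0) * 1000.0
    print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
    print(f"ampicillin: {amp_ug_per_ml:.0f} ug/ml")
```

The beta-mercaptoethanol value comes out just under 24 mM, which the protocol rounds to 25 mM, and the ampicillin in the supplemented agar works out to roughly 200 ug/ml.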

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing 

High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in
collaboration with 5Prime 3Prime Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all
stages of DNA preparation from bacterial growth through final DNA purification. Average template concentration was 
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted. 

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 ul) containing 50 ug DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One ul of fragments was used with 1 ul of DASH II vector (Stratagene) in
the recommended ligation reaction. One ul of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 ul of recommended SM buffer and chloroform treatment). Yield was about
2.5x10^9 pfu/ul.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol.
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10 s pfu/ml.

Mini-liquid lysates (0.1 ul) are prepared from randomly selected plaques and template is prepared by long range
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI).

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are run on a combination
of ABI 373 DNA Sequencers and ABI 377 DNA Sequencers. All of the dye terminator sequencing reactions are analyzed
using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373 and ABI
377 DNA Sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.

4. Protocol for Automated Cycle Sequencing

The sequencing was carried out using Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, ABI 373
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e., one
primer synthesis) steps were performed including denaturation, annealing of primer and template, and extension; i.e.,
DNA synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for
an oil overlay.

Two sequencing protocols were used: one for dye-labelled primers and a second for dye-labelled dideoxy chain
terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both
dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable
sequences.

Thirty-two reactions were loaded per ABI 373 Sequencer each day and 96 samples can be loaded on an ABI 377
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols.
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace)
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction.
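
The two editing steps described above (removal of leading vector polylinker and of trailing low-quality sequence) can be sketched as follows. This is only an illustration under stated assumptions: the text does not describe the software actually used, trailing quality was judged visually rather than by rule, and the function names, the polylinker string, the search window and the use of ambiguous "N" calls as a stand-in for low quality are all hypothetical.

```python
def strip_leading_vector(read: str, polylinker: str, search_window: int = 100) -> str:
    """Drop everything up to and including the first polylinker match
    that lies entirely within the first `search_window` bases."""
    pos = read.upper().find(polylinker.upper(), 0, search_window)
    return read if pos < 0 else read[pos + len(polylinker):]


def trim_trailing_low_quality(read: str, window: int = 20, max_n: int = 2) -> str:
    """Shorten the read from its 3' end until the last `window` bases
    contain at most `max_n` ambiguous (N) calls, then drop trailing Ns.
    Counting Ns is an assumed proxy for visual quality assessment."""
    end = len(read)
    while end >= window and read[end - window:end].upper().count("N") > max_n:
        end -= 1
    return read[:end].rstrip("Nn")


def edit_read(read: str, polylinker: str, window: int = 20, max_n: int = 2) -> str:
    """Apply both edits in the order given in the text."""
    return trim_trailing_low_quality(strip_leading_vector(read, polylinker), window, max_n)
```

For example, `edit_read("TTGGATCC" + "ACGT" * 10 + "NNNNNNNN", "GGATCC", window=10)` returns the forty-base insert with both ends removed.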

INFORMATICS 

1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed. (For review
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System
Sciences, IEEE Computer Society Press, Washington D.C., 585 (1993)). The system used to collect and assemble
the sequence data was developed using the Sybase relational database management system and was designed to
automate data flow wherever possible and to reduce user error. The database stores and correlates all information
collected during the entire operation from template preparation to final analysis of the genome. Because the raw output
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based
on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort.

2. Assembly 

An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence
fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments
of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements.
Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a
modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S.,
Methods in Enzymology 164: 765 (1988)). The contig is extended by the fragment only if strict criteria for the quality
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal
coverage and raised in regions with a possible repetitive element. Fragments representing the boundaries of
repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of
alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from
two ends of the same template point toward one another in the contig and are located within a certain range of base
pairs (definable for each clone based on the known clone size range for a given library).
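
The hash-table step described above can be illustrated with a short sketch. It is not the TIGR Assembler code: it indexes only the forward strand, counts shared 12-mers as a crude measure of "potential overlap", and omits the Smith-Waterman alignment and match criteria entirely. The function names and the fragment dictionary layout are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length stated in the text


def kmer_index(fragments: dict, k: int = K) -> dict:
    """Map every k-mer to the set of fragment names that contain it."""
    index = defaultdict(set)
    for name, seq in fragments.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def potential_overlaps(fragments: dict, k: int = K) -> dict:
    """Count shared k-mers for every pair of fragments sharing at least one."""
    shared = defaultdict(int)
    for names in kmer_index(fragments, k).values():
        for a, b in combinations(sorted(names), 2):
            shared[(a, b)] += 1
    return dict(shared)


def overlap_counts(fragments: dict, k: int = K) -> dict:
    """Number of distinct partners per fragment; unusually high values
    would suggest that a fragment falls into a repetitive element."""
    partners = defaultdict(set)
    for a, b in potential_overlaps(fragments, k):
        partners[a].add(b)
        partners[b].add(a)
    return {name: len(partners[name]) for name in fragments}
```

A candidate list built this way would then be passed to the alignment step, which is where the overlap length, unmatched end and percentage match criteria apply.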

3. Identifying Genes 

The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorf,
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0), using the BLASTN search
method to identify overlaps of 50 or more nucleotides with at least 95% identity. Those ORFs with nucleotide sequence
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and
compared to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases.
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels
are shown in Table 3.
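
The selection logic above can be sketched in two small functions. The ORF scan is a generic minimum-length finder and is not the zorf program, whose rules are not given here; it assumes ATG starts and the forward strand only. The table assignment simply encodes the thresholds stated in the text, with hypothetical parameter names for the BLAST results.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 80) -> list:
    """Return (start, end) pairs, end exclusive and including the stop codon,
    for ATG-to-stop ORFs with at least `min_codons` codons before the stop."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def assign_table(aa_len: int, blastn_overlap: int = 0,
                 blastn_identity: float = 0.0, blastp_p=None):
    """Return 1, 2 or 3 according to the thresholds in the text, else None."""
    if blastn_overlap >= 50 and blastn_identity >= 95.0:
        return 1  # match to a known S. aureus nucleotide sequence
    if blastp_p is not None and blastp_p <= 0.01:
        return 2 if aa_len >= 80 else None  # match to a database protein
    if aa_len >= 120:
        return 3  # no match at these levels
    return None
```

For instance, an ORF of 150 amino acids with no BLASTN or BLASTP hit at these levels would be assigned to Table 3.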

ILLUSTRATIVE APPLICATIONS

1. Production of an Antibody to a Staphylococcus aureus Protein

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the
protein can then be prepared as follows.

2. Monoclonal Antibody Production by Hybridoma Fusion

Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein
over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated.
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued.
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use.
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular
Biology, Elsevier, New York, Section 21-2 (1989).

3. Polyclonal Antibody Production by Immunization 

Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by
immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with either inadequate or
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis,
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971).

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as
determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology,
Wier, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described,
for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer.
Soc. For Microbiology, Washington, D.C. (1980).

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which
determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic
reagent.

3. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Staphylococcus aureus genome, such as those of Tables 1-3 and SEQ ID NOS: 1-5,191
can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers
are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence,
it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are
approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow.
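
The primer preferences above (minimum length and matched G/C ratio) lend themselves to a simple check, sketched below. The melting temperature estimate uses the Wallace rule of thumb (2 degrees per A or T, 4 degrees per G or C), which is a substitution made for illustration and is not named in the text; the function names and the tolerance `max_gc_diff` are likewise assumptions.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C for short oligonucleotides."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def primers_compatible(fwd: str, rev: str, min_len: int = 18,
                       max_gc_diff: float = 0.05) -> bool:
    """True if both primers meet the length preference and have
    approximately the same G/C ratio (within the assumed tolerance)."""
    if len(fwd) < min_len or len(rev) < min_len:
        return False
    return abs(gc_fraction(fwd) - gc_fraction(rev)) <= max_gc_diff
```

Setting `min_len=15` applies the less stringent length preference stated in the text.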

4. Gene Expression from DNA Sequences Corresponding to ORFs

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Commercially
available vectors and expression systems are available from a variety of suppliers including Stratagene (La Jolla, California),
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular
expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767, incorporated herein by this reference.

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the
Staphylococcus aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1
(Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and
ligated to pXT1, now containing a poly A addition sequence and digested with BglII.
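
The primer design described above (a PstI site on the 5' primer and a BglII site at the 5' end of the 3' primer) can be sketched as follows. The recognition sequences CTGCAG (PstI) and AGATCT (BglII) are standard; the function names, the annealing length and the optional `pad` of extra 5' bases are hypothetical and not taken from the text.

```python
PSTI_SITE = "CTGCAG"   # PstI recognition sequence
BGLII_SITE = "AGATCT"  # BglII recognition sequence
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(target: str, anneal_len: int = 18, pad: str = "") -> tuple:
    """Return (five_prime_primer, three_prime_primer), both written 5' to 3'.
    The 5' primer carries a PstI site ahead of the first `anneal_len` bases
    of the target; the 3' primer carries a BglII site ahead of the reverse
    complement of the last `anneal_len` bases."""
    five_prime = pad + PSTI_SITE + target[:anneal_len]
    three_prime = pad + BGLII_SITE + reverse_complement(target[-anneal_len:])
    return five_prime, three_prime
```

In this orientation the BglII site ends up downstream of the amplified ORF, which is consistent with the requirement that the insert be followed by the poly A addition sequence.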

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island,
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the
transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant.
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product,
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA.

Alternatively, and if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5
(Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be
produced using in vitro translation systems such as in vitro Express™ Translation Kit (Stratagene).

While the present invention has been described in some detail for purposes of clarity and understanding, one
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true
scope of the invention.

All patents, patent applications and publications referred to above are hereby incorporated by reference.

ORF      No.     Description                       Ranges
322_1    5246    acriflavin resistance protein     58-75, 153-164, 203-231, 264-284
415_2    5247    transport ATP-binding prot        108-126, 218-227, 298-308, 315-334
214_3    5248    2-nitropropane dioxygenase        123-136, 216-233, 283-292, 297-306
587_3    5249    clumping factor                   5-14, 43-54, 59-68, 76-95
685_1            signal peptidase                  59-68, 72-81, 86-95, 99-108
54_3             fibronectin binding protein 1     23-32, 37-46, 50-59, 89-98
54_4             fibronectin binding protein 1     43-52, 66-75, 95-104, 147-156
54_5     5253    fibronectin binding protein 1     49-60, 81-90
54_6             fibronectin binding protein 1     55-71, 82-97, 139-158
                 lipoprotein (H. flu)              11-20, 61-70, 96-105

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: Human Genome Sciences, Inc.
(B) STREET: 9410 Key West Avenue
(C) CITY: Rockville
(D) STATE: Maryland
(E) COUNTRY: US
(F) POSTAL CODE: 20850

(ii) TITLE OF INVENTION: Staphylococcus aureus Polynucleotides and Sequences

(iii) NUMBER OF SEQUENCES: 5255

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 Mb storage
(B) COMPUTER: HP Vectra 486/33
(C) OPERATING SYSTEM: MSDOS version 6.2
(D) SOFTWARE: ASCII Text

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 60/009,861
(B) FILING DATE: 05-JAN-1996

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5895 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60
GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120
aaataGGTAA TAATGTAATA GCTTCTATTA TGATGCCTAA TTGAATGAAT TGGGCAAATG 180
GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 240
TTTTTCGTTT AATAAGTCGC ACAGGAATGG GCTTCTTTTT AGTTGCTGCA GGAGCATATA 300
CTGAGATTAC ACCTAAAGAA ATAACTGTTA AAATAATCAT AATTAAAAAG TTAATATGAA 360
AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 420
AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 480
TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 540
TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 600
CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 660
TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT TTTATCATTA 720
TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 780
TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAGAAAA ATTGGAAAAT 840
AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 900
TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960
AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 1020
TCACTAAAAA ATAAGATGAA TAATTAATTA CTTTCATTGT AAATTTGTTA TCTTCGTATA 1080
GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 1140
GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200
TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAACAACATC GATTTATCAT 1260
TATTTGATAA ATAAAATTTT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 1320
TAACTGTAAA ACATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 1380
CTTTTTTGTA ATGAAGAAGG GATGAGTTAA TCATCATTAT GAGACCCGCC GTTAAAATAT 1440
TCATTTGCAA AGGGCGAAAT GGGTTCTTAC TGAGTTATCT ATTATAAAAA AATAAACATA 1560
GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 1620
CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 1680
GTTGCTGTTC TACTTCATTT AAGTTTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 1740
ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 1800
TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG 1860
CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGAGATAGG CCATTGCGCC ACATAAAATG 1920
CAATTTTAGC ACCACTACGA GCAGGATATC TTAATAATTC TGGAAAACGT AAATCATAAC 1980
AGATAAGTTG GGTCACATAA GTACCGTCAG ACAATTGAAA GGGTTCAGCT ACGTATTCGC 2040
CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAAGTAA ATGAACTTTG TCGTATTCaT 2100
TAATCAGCTG GCCACTTTTA TTCACAGTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160
TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220
ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280
CATTATTCCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TTTTTTTCGA 2340
ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 2400
AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460
TATGTTGAAA CGCAAAAAAC GAGCACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2520
TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2580
GACAATGCAA GTTGGGGTGG GCTCTAACAT AAAGAAATAC TTTTTCTTTA GAAATTAGTA 2640
TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2700
AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2760
CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2820
TTTGGCCAGC AGCTTCACGA ATATCACCAA ATGGCATTTG AGCAATAAGT TTCCAAGTTT 2880
TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATAG TGTTGTAATG 2940
AAGCACCTAT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC ATGGCATTTG 3000
TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3060
TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120
TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180
TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3240
CATCGCTAAT TGATATCGAA TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360
TGTCAAAAGT CATTGCTTTT TTATCTTTTT TAAATAAGCC CATAATTATT GCTCCTTCTT 3420
TAGTAAAGAA TACTTAATAG ACTAAGTATA AAATTTATAC TCGTACTTGT AAAGCAATAT 3480
TTACGAAAAT TTCAAGAATA TTAATATTCA TTTTCAAATT CCAAATATAA ATGCATTTTC 3540
AACGCATATT TATTATACTT AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3600
TGTCATATCA TTGGTTTAAG AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3660
GTAGTTTAGG GCTTGCAACG CACACAGfTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720
CAACTACTAA TTTGAATCAT AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3780
ATGAGACTGG GACACCTCAC GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCGAATA 3840
GTCGTGATGC TAATCCTGAT TCGAATAATG TGAAGCCAGA CTCAAACAAC CAAAACCCAA 3900
GTACAGATTC AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3960
CAGATAACCC GAAACCAAAA CCGGATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4020
CGGATCCAAA ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4080
ATAAACCAAA GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 4140
ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 4200
GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATGCTTCAG 4260
ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4320
CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAGTATA 4380
ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TTATGGAGAA GTTACTGATG 4440
AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4500
TACAACAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4560
ACTATCGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4620
CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4680
AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAAGAAAAGT GCGTCAACTG 4740
CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4800
TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4860
CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4920
TCCGTGGCTT TTTTCAAAGC CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980
TGTAACATAT GGATAATAAT TGGAACAGCA AGCCGTTTTG TCCAAACATA TGCTAATGAA 5040
AATATTAATG AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160
GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 5220
ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TAGCTTTTCT 5280
GTATTAGGAC TTACTTGTTG TCGACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 5340
TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 5400
TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 5460
GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520
ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA GAAGCGGTAA AAATTGAGAC 5580
AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 5640
ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700
GTTATGACTT GAAATTTTGA CCAAATTTGA TTATTATAAA TGTATGTTAG CACTCTTTAA 5760
TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820
ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880
TGATAGTGCT AAAGA 5895

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6796 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TTTGAAAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 60
TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 120
CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 180
ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 240
CCGATTATAC AAATTAATTT GGGAACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 300
TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360
AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420
AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 480
ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 540
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AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCGTTTTGTT CCTACTGAGT TGGGAGAAAT 660 

AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720 

5 

TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780 

CGGTTTCTTT AGTAGCTTTA AACAAGATGf TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 84 0 

10 TGAAATCAAA GATGAGCCAG GCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900 

AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960 

AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020 

75 AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080 

CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 1140 

TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCX3ATT ATAAAGAGGC 1200 

20 

AGCGCAGAAA TAATATTTTT ATTTCCTAGA TACATTTTAA GATTGTTAAA TAGAATCATT 1260 

AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 1320 

ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 13 80 

25 

GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 1440 

GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500

GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560

GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620

AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680

ATTACTGAAA CACTTAAAAA TCATGAAAAT ATCACAGTTA TTAATGAAGA AATTAATGCC 1740

ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800

GAAATAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860

ATTGAAAAAG AATCTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 1920

GAAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980

GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTTCGAGGGT 2040

TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100

AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160

AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2220

TGGGGAGCTC AAAAAGAAGT CATTAAATTA ATTCCAGGCT TAGAAAATGT TGATATTGTT 2280

AGATATGGTG TGATGCATAG AAATACCTTC ATTAATTCAC CGGACGTATT AAACGAGAAA 2340

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TATGTAGAAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 2460

GGCAAGGGTG AGGTAGTATT TCCGAGAGAA ACAATGATTG GAAGTATGGC TTACTATATT 2520 

TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2580

TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAGCT 2640

TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700

TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2760

TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2820

AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2880

AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2940

TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3000

TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3060

GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120

ATACATTGAA AGTTGAACGG AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180

TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 3240

GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 3300

CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360

TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 3420

TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 3480

TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATCCGTGTT TCGGAATTAG 3540

TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600

GGAQCAAAGA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 3660

TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720

GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATATTGTT AAACGAACAG 3780

CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3840

ATCAAGGTGC AGACCTAAGA ACAGTACAAT CGTTATTAGG TCATGTTAAT TTGTCAACAA 3900

CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3960

CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4020

TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA CGCTTGGTCA 4080 

ACAAGTCATC ATGAAACAAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 4140 
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ATTACAACAG TTTAGTGGTA ACTTAGAAAG AGCTGCTGTT GAATTGGCAC AAGAATGGCG 4260

AGGCGATAAA CAATTACGTC AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4320

TTTAGTTGTC AGTGGAACTG GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4380

ATCAGGAGGC AACTACGCAT TAAGCGCAGG ACGTGCATTG AAACGCCATG CATCGCATTT 4440

GTCTGCTGAA GAAATGGCAT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4500

CAACGATAAT ATTGTTGTCG AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4560

AATTAATTTT AGTTAAAAGA CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4620

ACTCCAAAAG AAATCGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TGATGCTAAA 4680

CGTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740

AAGCAAGAAA TTTCACCTAA AAATATTTTG ATGATTGGAC CAACCGGCGT TGGTAAAACT 4800

GAAATTGCAA GAAGAATGGC CAAAGTTGTC GGCGCGCCAT TTATAAAAGT AGAAGCTACT 4860

AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGAAA GTATGGTTAG AGATCTTGTT 4920

GATGTTTCAG TAAGATTAGT CAAGGCGCAG AAAAAATGAT TGGTACAAGA TGAAGCAACA 4980

GCTAAGGCCA ATGAAAAACT TGTTAAGTTA TTAGTTCCAA GTATGAAAAA GAAAGCGTCT 5040

CAAACGAATA ATCCTTTAGA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGGACAAAAT 5100

AACGAAGATG AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160

AGACAGCTAG AAGAAGGCAA ACTTGAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 5220

CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 5280

CAATTAATGC CTAAAAAGAA AGTTGAGCGA GAAGTTGCTG TTGAGACGGC AAGGAAAATC 5340

TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 5400

GAATTAGCAG AAGAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 5460

AATCATAATA GTGGTCAAGA TGTCTCAAGA GAAGGTGTTC AAAGAGATAT TTTACCTATA 5520

CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 5580

ATAGGTGCTG GAGCTTTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640

CGTTTTCCGA TTAGAGTTGA AffTYZATAnT 1 inl VwkjAj inVj AAGAATTTTG 5700

ACAGAACCAA AATTGTCATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760

ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5820

CAAGATACAG ACAACATTGG TGCACGTCGA CTTCATACAA TTTTAGAAAA GATGCTAGAA 5880

GATTTATCAT TCGAAGCACC AAGTATGCCG AATGCAGTTG TAGATATTAC CCCACAATAT 5940
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AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAACGAGAGA GTTAAACACG 6060

TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 6120

AGCGTAACTG TAACAAATGT ATTTATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 6180 

CTAAATGAAT TATTAAAAAG TCAAAGAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 6240

AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TGATATCGAC 6300

AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT TCATAGATAG TCGTACAACT 6360 

ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 6420 

GATGATTTTA ATGaAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATC 6480

GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CGCGCGATAA AGCTGCTATT 6540 

ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CGATTGAACA TATCTTTGAA 6600 

GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 6660 

ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 6720

CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6780 

TTAGAAAAAA GTAAAT 6796 
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2073 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATCOTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 60 

kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 120

GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 180

TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 240

TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 300

GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 360

GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 420

ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 480

TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 540
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TCAAATATAA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATATCAA CGATGCATTG 660 

TTACGGTCTA TACCTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 720

AGTGTTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 780 

ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 840

GCCGCTTCAA TGAGTGATGG ATCAACTTCT TTAATTCCAG TATACGTATT CCTTAAAATT 900 

GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 960

GGAATCATTA AACCTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 1020 

ATTACGATTT CAGATATCGT TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 1080 

GCAGTTGCAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 1140 

AGTTGCCCCT TACGTTGACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 1200

TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 1260

TTCTTGCATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 1320 

TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 1380 

AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 1440 

ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500 

GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATGACCAA GTTTCATCGC 1560 

CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620

TAAATCATCT TGAAGTTTTT CTCGGCTGAT TGGGTGTAAT GCACTAAACG GTTCATCCAT 1680

TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 1740

CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTTCTAATC CAACCATTTC 1800

AAGTAATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 1860

TTGTGCAAtA TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920

TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT CTTTAAAATA 1980

7LATATAACCT TCACTTAAGT GAATGAGTCG ATTAATGATT TTTAATGTCG TAGTTTTTCC 2040

ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13321 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 60 

AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120 

TGAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180 

TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 240 

TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 300 

AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 360 

GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420 

TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 480 

TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 540 

CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600

TAAAGCTTTT CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TGCCACTGTT 660 

GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 720 

ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 780

AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 840

ACTAGTGTTC TTTTTTAGCT TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900 

TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 960 

AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 1020 

ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080

TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 1140 

TACC&AACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 1200 

TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260 

CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320 

CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 1380 

AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 1440 

TACAACTTTA ATTAGATTAT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 1500

TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 1560

TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 1620 

AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGTAGCCa tTGCtAGCGC 1680 
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TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA ATCACAAACA ATCCTACAAT 1800

AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA GGCCACCCAT 1860

TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 1920

GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 1980

ATTCATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2040

AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100

TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 2160

AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGGTTTA CCTGCTTATC 2220

TTCAAACCAT TACGGTTAGT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 2280

CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 2340

CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 2400

GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 2460

ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 2520

AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 2580

TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATACGATACT TATTGCAAAT 2640

ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 2700

TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGAGGT ATTTGTTAAA CTTAATGATA 2760

TGCTTTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2820

AATGTGAATA CTATATATAA TGGTAATTTT TGTTCAGGAA AAACAGTCGC TATTCCAAAA 2880

GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATGCCAT AATAACAACC 2940

CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTATCAG CGTTTCTATG ATCAGTACTT 3000

CTACGGGTAG CGTTTCTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 3060

ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120

TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 3180

CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 3240

AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 3300

TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 3360

TGGATTGAAG TAACTAAATT AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 3420

TCTATTTGCA CTCCACCGCT AACACCGAAC ACTTGTCCGG TTGTATAACT TGATTCTTCT 3480
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GTTTTTTGAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT 3600

GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT 3660

GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA 3720

GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA 3780

TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT 3840

GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCGAGCAAGT 3900

AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG 3960

AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC 4020

ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT 4080

GCACCTTCTT TAGCATACGC AATTGCTGCT GCACGCCCTA TTGCTGAGTC ACCAGCTGTG 4140

ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTG GCCACAATCG 4200

GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC 4260

GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC 4320

TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA 4380

CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA 4440

CAAAATACTC AGGTACTTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA 4500

AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC 4560

CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT 4620

GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA 4680

AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT 4740

CTTGTAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT 4800

GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG 4860

TTACTACCGC TTCATCAAAC GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA 4920

TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC 4980

CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA 5040

TTGCAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT 5100

CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT 5160

GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG 5220

TAGCCAAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT 5280
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CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTCGCA AGTGGTAAGT 5400 

CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 5460 

CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCAGGTAACA GCTTGATCTT 5520

TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580

TCATCTTTGA CTATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 5640

AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 5700

AATTGAGACT CTATACAAAA ACGTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 5760

GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAGCTTTCTA ATTATTTTAT GCTTTAAAAG 5820

ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 5880

CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 5940

CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 6000

CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6060

GTAAGTTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120

TATTGTCAAG CATGCCTCTT GCATCTGCAT AATCTTCTTT ACCTAAATAG AACGTCGTTG 6180

CAGGTAAGAG GACAGCTACA GTATCACTAT TTCGCAACTT TTCTTTTCCT TTATCACTAG 6240

AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 6300

ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG GCTTTTTGCA 6360

TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCGGCAAAGT 6420

CTGCATATTG TTTTACTTCC GGAAGTAACG CAATCATTTC TTCTAAAAAT GCCTCATTTG 6480

AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCTAAATCAT 6540

ATTTCTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6600

CATAACCACT CTTACTTTCA ACTGCAAGCA CGCGGTGTTT AATCATAGTA AGCAAATCAT 6660

GCTCTGCTTT TTTAAACAAG TCATCTTCGG ATGTTTCTCT AGTAGGATTA ACGGTAGATA 6720

ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6780

TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6840

ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATG GTAGTCATCT GTATGTGTTC 6900

CAGCATATAC AATTTTGCCA TCTTTAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6960

CATCTAATTC CTTACCCTTC AAAGGTTTAT CTGTTGATCT CGGTAAAATT AATTCTGCTA 7020

TATGATTAAT TATTAAATCA TTCATTACTT ATCACCTGCT TTATCAATCA TTGGAATATG 7080
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AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 7200

TCCATCTGCT ACAACAACCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 7260

GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 7320

CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 7380

AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCGTC 7440

ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 7500

AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 7560

TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCGCACGCT CGATATCTTT 7620

TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 7680

AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 7740

TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTCAAG 7800

CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7860

TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATCCT TGTGGCACAT ATCCATTTAG 7920

CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 7980

CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTC 8040

TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCTGTTTTAA CATCACAGTA 8100

TTTCGTATCA ATTCGCTTAT CAACACGTGT TTCATCAACA TCCACGCAAA TTGCTACCCC 8160

ATGATTCATA GTAATTGCTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 8220

TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CAAATGTCTC 8280

ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 8340

TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 8400

AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 8460

AGCAACTGGC TTTCCTGATT GTACTAACAT TGTCTCATCT GATTCTAATT CTCGTAACGT 8520

TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TCTCCAATAC CACCATAAAC 8580

AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8640

TACTGCTTCT TGTTCCCAAC CTTTACACTC AATACTCAAA CCTTTTTTTG CTTGAATTTT 8700

TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8760

CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AGCCAACAAA TATAAATAAA 8820

CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8880
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CATATTAGCC AGCCATCTTT AACTGCTACG ATTAAAAAAA TGGAAGCAGA TTTAGGTTAT 9000 

GACTTATTTA CACGTTCAAC AAAAGACATC AAGATTACCG AAAAAGGAAT ACAGTTTTAT 9060 

CGTTATGCGA GCGAATTAGT TCAACAATAT CGATCCACGA TGGAAAAAAT GTATGATTTA 9120 

AGCGTTACAT CAGAACCAAG GATAAAAATT GGGACTCTTG AATCTACGAA TCAATGGATT 9180 

GCGAATTTAA TTCGAAAGCA CCATTCCGAC TACCCTGAAC AGCAATATCG TTTATATGAA 9240

ATACATGATA AACATCAATC TATAGAGCAA TTACTGAATT TTAATATTCA TTTAGCTATA 9300 

ACAAATGAAA AAATAACCCA CGAAGATATA AGATCCATTC CTTTATATGA GGAATCTTAC 9360 

ATTTTATTAG CACCCAAGGA AACATTTAAA AATCAAAATT GGGTAGATGT TGAAAATTTG 9420

CCACTCATAT TACCAAACAA AAATTCTCAA GTGCGCAAAC ACTTAGATGA CTATTTTAAT 9480

AGAAGAAATA TTCGTCCAAA TGTCGTTGTA GAAACAGATC GATTCGAATC AGCAGTTGGA 9540 

TTTGTTCATC TCGGCTTAGG TTACGCTATC ATTCCGAGAT TTTATTACCA ATCATTTCAC 9600 

ACGTCTAATT TAGAATATAA AAAAATTCGT CCAAACTTAG GCCGAAAAAT TTATATCAAT 9660 

TACCATAAAA AACGCAAACA CTCCGAACAA GTACATACAT TCGTACAACA ATGCCAAGAT 9720 

TATTTATATG GACTTTTAGA GGCTCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9780

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9840 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9900

CTCAGTCAAC TGTATACCTT TTTCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9960

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGTGCCTCT TATGTAGTTG 10020 

CGTAGTCAaC TGTaTACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 10080 

CGCAGATCAT CGTATAAAAA TTAATGACGT CATTTCAAAA ATCGATACAA AAATAATTTA 10140

TTATAAAAAT TCTAAGAAAG AAGTGAAGCA GATGTTAAAA TCTATTAATC ATATATGCTT 10200 

TTCAGTCAGA AATTTAAACG ATTCAATACA TTTTTATAGA GATATTTTAC TTGGGAAATT 10260

GCTATTGACT GGTAAAAAAA CTGCTTATTT TGAGCTTGCA GGCCTATGGA TTGCTTTAAA 10320

TGAAGAAAAA GATATACCAC GTAATGAAAT TCACTTTTCA TATACACATA TAGCTTTCAC 10380

TATAGATGAC AGCGAATTTA AATATTGGCA TCAGAGGTTA AAAGATAATA ACGTGAATAT 10440 

TTTAGAAGGA AGAGTTAGAG ATATTAGAGA TAGACAATCA ATTTACTTTA CCGACCCTGA 10500 

TGGTCATAAG CTAGAATTAC ATACTGGCAC ACTTGAGAAC AGATTAAATT ATTATAAAGA 10560

GGCTAAACCA CATATGACAT TTTACAAATA AGGTGTCATT ATAAAAAGGC CTCTTGAACT 10620 

CCGTTAAAAT TTTAATTAAT TATTATATAA TAAGAGAACT TTTCAAACAA TACAGTTGTT 10680 
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TTACTGGAAT TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10800

CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATAT TTTTAATTAT TTTAAAAAAT 10860

GATGCACTAA ATTAGCAACG AGCTTAGCAG TTCTATTGTC AGCGTCATAT GTTGGATTCA 10920

TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAAT AATACGTTTT GCTAATTCAA 10980

GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040

TATCAATGAC ATCCATACAA ATCGTAAACA TAATGACATC ATGTTCATGT ACAAAACGTT 11100

CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 11160

CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220

CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCCAG 11280

ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 11340

ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 11400

CACCTAATAA AAATGTTTGT CTATGATTAG CAATTGACTT CGCTGCAAGC ATAGCAAATT 11460

CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTCCGTAA TCGACTAAAG 11520

TTcACATTGA TTCAAATCCG GCAAACCTGC TTAATCGCAT CTGGTCCTTC 11580

TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACGTTTG TCAACAGCAT AGCCTAATAT 11640

ACCGACCCCT GATGGCATAC TACTCTTTTC CAGCTTAGAC AAATCTTCAA ATGTTACTGT 11700

TTGAAAATGT ctaaattttt TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760

TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAACAATT 11820

AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGT TAAATCCGTA TCTTAAAAAT 11880

ACTTATCTAC ATTACTTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940

TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12000

TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATATT TTGCTTGTTA CCTAATGGAG 12060

GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTTCGT 12120

CTTCTTTGTC ATTACAATGG CGTTATCCAT TATGCTCAGA GATTTCCAGT CTATAATTGG 12180

TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA 1 Innl xv?v~i.Vj 12240

TATACTCGTT TTCAAATATA AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 12300

GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360

TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12420

AGTACCTATT ACACACATTC TGATTGGACA TATTCTGATG GCGTTCGTAG TAGAATTCGG 12480
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TGTTGTTGGT TTGATGTATT CAGTTTTCTC AGCAAATACA ACTTATGGTA GAGAATTTGC 12600

TGCTTATAAC TTCCTTTATA CATTCTCATT CTCTATGATT CTTGGTGAAT TAATTAGAGC 12660

GACTAAAGGA CGTACAATTT ATATTGCAAC GACATTCCAT GCTTCAATGA CATTCGGACT 12720

TATTTTCTTG TTTAGCGAAG AAATCGGCGA TCTATTTTCA ATCAAAGTCA TCGCCATTTC 12780

AACAGCAATC GTTGCAGTAG GATACATTGG TTTAAGCTTA ATTATCCGAG GTATTGCATA 12840

TTTAACAACA AGACGAAACC TTGAAGAACT TGAGCCTAAT AATTATTTAG ACCATGTCAA 12900

TGACGATGAA GAAACTAATC ATACTGAGGC TGAAAAATCT TCTTCAAATA TTAAAGATGC 12960

TGAAAAAACA GGTGTAGCTA CTGCATCAAC GGTTGGTGTT GCTAAAAATG ATACTGAAAA 13020

TACAGTGGCT GACGAACCAA GCATTCATGA AGGTACTGAA AAAACAGAAC CTCAACATCA 13080

CATAGGTAAT CAAACTGAAT CTAATCATGA TGAAGATCAt GACATCACTT CGGAGTGAGT 13140

AGAATCAGCm GaATCAGTTA AACAAGCACC ACmAAGTGAC gATTTaACAA ACGATTCAAA 13200

TGAAGATGAA ATAGAGCAAT CATTAnAAGA ACCTGCGACT TATAAAGAAG ACAGACGTnC 13260

ATCAGTTGTA ATTGATGCAG AAAAACATAT CGAAAAAGCT GAAGAnCAAT CTTCAGATAA 13320

A 13321
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8549 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

ATGTGTTGTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 60

AATAAATTAA TTAGTATACA GCTAGTTTTT CTAATTGTTC TTTAACTTGA ATTAAGTTTG 120

ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 180

AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 240

TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 300

AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 360

ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 420

AAAAATAACC ACACTCCTAA ATTAATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 480

AAATAACCGC ATTATTAAAG ATACGGTTAC TCTGTTATCT GTAAATATAA TAGTAGTTTA 540
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AAACAGGACT CCACATAAAA ATCAACTCCT TTATATACCA TAATGATACT ATATTTTCTA 660 

GTTTATTTCA ATTTTTCAGT TTTTAAAAAT GAGTTTCTGT TTTTATTTAT ACGCTTTTCT 720 

GTTTTCTTTT TAAATTTTAT CTTTTTGTTA TTCCATTCAT TGTAAAATTC TATTAAATTA 780

ACATAAAATT TTTCATGCCC TATTTTATTT GTTGATGAGA TATCAATGTA AAGACTCAAT 840 

ATTGTTTTTA AATAGATTTG ATGCAACGAC TGATAAACCG TATTACTATC TGCTATGTTA 900 

TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 960 

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AAGAATTTCT 1020 

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 1080

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 1140

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 1200

GCAATCTCTT TATCAGTAAA GACTGTTCTT AGTTCGTGAT TATCAAAACT TAAATTCATC 1260

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACCAACAT TTTCTAAAAT TTTCTTGTTT 1320

ATCTCCCCTA TATCAAAACT CCTTTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 1380

ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 1440 

TACGATATCC CACCAGATAA AATAGATGaT ATTATCGGTA TGTATATATC ACCTTTCATA 1500 

TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 1620

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 1680

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 1740

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 1800 

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 1860 

ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 1920 

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980 

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2040

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100 

TAAGTAAATA CTAAATCGTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 2160 

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2220

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2280 

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2340 
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CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AACATTACCA GCTACCAAAC 2460 
CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2520 
TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 2580
TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 2640

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 2700 
TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 2760 

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAATCTTTAT 2820

TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2880

GACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 2940

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCG GACTGCTCCA GGATCGTTAT 3000

AACCATGACC GGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTGCTTGC TTTTTGGCAG 3060

TTGCTTGCTT AATAACGCTT TTAGCTTTAT CTCCAACACT TACTTTATCT GGGAAATTTA 3120

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 3240

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300

ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCGGTAA AAAGCTATCA TAGTTTTTAA 3360

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAGCGTATA 3420

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAACAT ATTTGCGTAA TCGTAACACT 3480

GAAATCCATA AAACAAATCA GGATTGAACT GCTTCCCTAA TGAATTATCA AACCATTTTT 3540

CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACG CTAAATCATT TGTGTCGTTC 3600

ATATTCGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 3660

TTTTCAGCTT TATATTTCTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3720

TTATCTTTAT ATGTAGTATA TAAAGCAACA AGTGTTAAGA TAATCGATGA AACACTTTCT 3780

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3840

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTATCCAAA 3900

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TTACAGTTAC TAAACCAGAA 3960

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4020

CGAAAAATAG TTTTTAACAA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 4080

AGAATCCACA TCTTGATGTC TCTAATATTT TTAGCATTTT TCTCTTTATT TTTTTCATCT 4140
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TGCGTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 426 0 

TCTAATCGCG TTAAACGCCA ATCTTGTTCG TGTCGTTTGG TAAATCCAAA CATTACACCA 4320 

CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 4380 

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT ATCTACATAC 4440 

CACGCTATAT CTTCTTTAGT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4500

GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA ATTTTTTATT 4560 

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTGCTTA TTAACTTGCA TCGATAACTT 4620 

TGTACTTTGA ACAACTTGTT TCTGCATACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4680

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4740

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4800

TTGAGAGAAA TTTTCTGGTA AATTTTCAAT ATCAATACCT TCTTCAAAGC CACCAATGAT 4860

AGCGTATGAA ATTATCTGAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 4920

ATAATTTTGT TAATTGTCCC TCTATTTGCG TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4980

TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 5040 

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100 

TTTCCTGCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160 

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 5220

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 5280

TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 5340

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TTCTTTTCTT 5400

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 5460

ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 5520

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580

CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCTTTTTG CCAATTTGAA 5640

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 5700 

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 5760 

AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 5820

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 5880 

CCATTTTTTA TAGCCTCGTC CATTGCTTTC GCACGATCCA TAATAGTTTT TTCTAATTCC 5940 
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TCAACGTTAA ATGTGATAGT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060 

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120 

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT GACTCCTTTA 6180 

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6240 

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TGACTCTCTC CTGTTCTAAA AGTTATATCT 6300 

AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 6360 

AAATTACTCA TTAACTTATA ATTCTCGCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 6420 

TACTATCATA ATTTTTCAAT AGTATCTCAG CAGATGCTGT AACACTATTA CGAACTAGCC 6480

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 6540 

TACGTTCAGT TGGAAAATCT GTAAAACGTT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600 

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 6660 

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6720 

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 6780 

CGCCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6840 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6900 

TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6960

CTGTGTCACT CaTGATAAAA GGAACGCCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 7020

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaTCACC ACCGACAGTA ACACCTAGCA 7080

TCAAAGCTTT TTTACCACTA TCTTTGTCAT AGTATATTTG CAAACCTTtC TgCTTCCGCA 7140

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 7200

TCTCeTGTTT CTAAATCGAA AGCCGTTAAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 7260

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 7320

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 7380

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 7440

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 7500

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7560

TTAGGCGTAT ACTTGAAACG AACTAATGTA TTCTCATTAT TACCATTTAA GATAAAACTA 7620

TAAATCCATA ACTCATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 7680

ACAATCAATG AGCTGTCTAT AAATTGACGA TTAGGTCTTA GACGACTTAG CATATAGCCA 7740
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ATTACTGCAT 


TTGTAAaAGG 
j> ivi rtrvy i»vjv? 


TGCAAGTTCT 


GTCACAAATA 


AAAATTCTTG 


CTTATCAGGT 


7860 


TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 7920

TPCTCITTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATGTGTCA 7980


AAGQ'l'lTCG C 


PrSTPTT AP A 1 1 1 L 


AACTCGAGCT 


TGAACAATCT 


CATTAGCACT 


GTTATTACGT 


8040 


f2/vmpPAPAA 


P A AGTG rOTT 


AATTTGACTT 

^^^^ A A X w/lw A A 


TGTAAAGATT 


TGTTTACTGC 


TGCTTGCGAT 


8100 


PT* 21 PP 21 TT 


f\e\ X rirt/i X x xv3 


PTdAGCGAAG 


TGTTGAATTG TTTTAGCTyT 


CTGATGCAAC 


8160 


X lAAAv-itlVj 




A A A AT 


TGCTCTATTC 


TTTGTAAGTT 


TTGTATTTCC 

A A\JAf*A A AW 


8220 






TY2 PT A A AG PT 


CCCAAATCCT 


TTATTAAATA 


CAAATTTTCC 


8280 


ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 8340

GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 8400

TGCtTCGCCT ATTTTTAAAT TATCTAATTT ATTTkTATCA TTTACCGAAA TGATACCGTC 8460

TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAGCTTTCCA 8520

ATGTGTAGCT GGAAAGTACT GTTTATCGT 8549




(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3601 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGACCAAAA 60

AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGmGTTGGT ACgTGGACGT GGGGGCCCTA 120

GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 180

GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 240

AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT GATTTTGCAA TTACATTAAA 300

AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 360

ATTCGAAAAG AATTCGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 420

TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 480

GGATACTGCG AAAGTATTAG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 540

ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTACCG GTGTGGAATG GATTAACAGA 600
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TCTAGAAGGA ATAAACTTAA CTTACGTTGG AGATGGACGT AATAATATTG CGCATTCATT 720 

AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AATCATTAAA 780

TCCAAAAGAG GCATATGTTG ATATTGcAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 840 

CATGATTACG GATAATATTG CAGArcCAGT TGAAAaTwCm GATGCTATAT ATmCAGATGT 900 

TTGGGTATCG ATGGGTGAAG AAAGTGAATT TGAACAcGTA TTAATTTATT AAAAGACTAT 960 

CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 1020 

TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080 

TTAGCTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGTTTGAT 1140

CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 1200

TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 1260

CCTCATATTG GTATTAAAGG AGAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 1320

AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 1380

ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG GATTATTACA 1440

GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 1500

CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GGGATTTTGT AGGATTTATC 1560

AGTGmTTGGG GATATTGGTT CTCAGCGTTT TTAGGCAATG TTGCCTATGC AACACTATTG 1620

ATGTGAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 1680

ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 1740

GCAGCATTTA TCAATAGTAT TGTTACTGTT GCAAAGTTAA TACCGATTTT ACTTGTAATC 1800

ATATGCATGA TAATTGCATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 1860

TCAGAGGGTG TATTGCCATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 1920

CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 1980

nAAAATGAGA AAGATGTAGG TAGTGCCACG GTTATAGGAC TTATATCAGT TTTAATTATC 2040

TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAATCATAT TTCGCAATTA 2100

GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG ATCTACACTT 2160

GTAAATATTG GTTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 2220

GGTGAATTAC CTTTCATTGT TGCAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 2280

AATAAAAATG GAGGACCTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 2340

TTAATAAGTA TGCTATTTAC ACAGAGTGCG TATCAATTTG CATTTTCACT AGCATCAAGT 2400
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p a & pt a pt a a 


apaatggapg 

nv»nn 1 \j\jnv»\j 


ATTGGTATCA 

All v7v X n X *—r\ 


TAGCCTCAAT 


TTATGCTATA 


252 0 




aTfiranr agg 


T A TP & ATTA P 

X n X \~t\t~% X X 


TTATTATTGA 


CGATGTTACT 


TTATATTCCA 


2580 


r r i ^ J 1 ^ f *' i " i 7 ; 


T'I'T'fiTfiPa AT* 
X X iAlnLOiil 


rnlf TPm A A Art 


yATwATPAGa 


CACGTTTGAT 


TAAATCAGrC 


'264 0 


inlnl iLtil 


1 Inlufti ArlX 


1- ATPGTAPTT 
l_n X \— vj x nv« X X 


GPAGTTATCG 


GG TTAATTAA 


GTTATTGATG 


2700 




Alljl X 1 X A An 


A AAfVSAGPGA 
nnnvrvjnVj V-wn 


PA A A A AT ATG 

\_nnnnn X n X \J 


AAAGAGAAAA 


TTGTCATTGC 


2760 


Ax 1AGG CkiG X 


X V» 1- w\ 1 




AflPAAPAGPT 


GA AGPAPAAP 


AAACAGCTAT 


2820 


TAGATGTGCG 


ATr'/^A 71 74 7i "O/^ 

ATG CAAAAC 


TiAAALv, 111 


IVPTTft ATTP A 
Ail Ivnl AV~n 




TTGTCATTT C 


2880 


ACATGGTAAT 


r"P^(^ TV TV 7V 7V 

GGTCCACAAA 


TTGGAAGTTT 


TA'PTTA. iTPPfi A 


P A AGPT A A AT 


PG AAPAGTGA 


2940 


CACAACGCCG 


GGAATGCGAT 


m/~vf^ 7\ rri tA /'Ml "l v"» 

TGGATACTTG 


X\j<j 1 (jUui i\J 


TP A P Af^/T/TT A 


TGATAGGPTA 


3000 


TTGGTTGGAA 


ACTGAAATCA 


ATCGCATTTT 


* » t **r*r* 9\ 9\ & tv*' 
AACTGAAAivj 


AAxAb 1 vjAX A 


rSAAPTTITAGG 


3 060 


CACAATCGTT 


ACACGTGTGG 


AAGTAGATAA 


AGATGATCUA 


VJviAx X XiaA.Xa 


APPPA APTA A 




AcCAaTTGGT 


Ccrrrn ATA 


CGAAAGAAGA 


AGTTGAAGAA 


« i nil jv /-I A H AMIP 


11 A P a f*2PP Al^l A 


1 O A 
JlOU 


CTCAGTCTTT 


aAAGAAGATG 


CAGGACGTGG 


TTATAGAAAA 


GXAGX IVjCV^X 


PAPPAPTAPP 


J ** u 


TCaATCTATA 


CTAGAACACC 


AGTTAATTCG 


AACTTTAGCA 




ATATT^TTPAT 
AXnX X\J X V-n X 


J J U w 


TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3360

AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 3420

CTTAATGATT CTTAGGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 3480

ACAAATCGAT GATATTGATG TAGCAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 3540

GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 3600

A 3601
(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGACACTATT AAATGAATTA GAGCACAATC TAACAAATCA AATTCATTTT TCAAAAGATG 60

AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CGATCCTGTT TCAACAAAGC 120

AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 180 
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TTATTGGTGA GGAAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 300 

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 360 

GCCGTGCCAT TATTAAGACA TTAAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 420 

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 480

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 540

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 573 
(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1221 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60 

AAATTTTCTT TTTCTTTATC AATCTGaTkG TAATTAACaC TTTCGaCTTC TGTAGGAATT 120 

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180

TTCATmTmAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 240

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 300

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 360

ATAGAAATCG CAAGTAAATC CCTTGCAGTT GCTTGAACAA TATTCTCGAC TAACTTCCCA 420

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 480

TGACTACCCC AACTATTTTC ACCAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 540 

GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 600 

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA AAAATTAACT 660 

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 720 

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 780

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 840

ACATTAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 900

AAACGCCATT GTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 960

CTTACTGCTA GTTCATTACC TTCTTCAGCA GTAAATGTCG TCCTAACTAA TTGACTTAAT 1020 
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AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT GCTTTGTTAA ATTCTGAAGT 1140 

TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 1200 
ACCCGTTCAT CACTGCACAT C 1221
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 60 

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120 

ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180 

ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 240 

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 300

TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 360

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 420

CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 480

TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 540 

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTITTATTCG 600 

AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 660 

TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720 

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780

AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 840

TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 900 

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 

GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 1020 

AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 1080 

CAATAAGAAA 1090 
(2) INFORMATION FOR SEQ ID NO: 10: 
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(A) LENGTH: 904 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 60

ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 120

GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 180

AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 240

AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 300

AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 360

AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 420

AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 480

AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 540

TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 600

ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 660

AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAGCATC 720

TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 780

AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 840

TTATGAAGAA GGGAtCaAAG rGTTgTTAGT atGTCCAaTG ArGGAAAAGA AGTTTTGCCT 900
GACG 904
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11271 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

GATTTCTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 60 

AGGTTGATTT TGTTGCTGTG TGTCTTTGTT GTCAGAAGTC GCTACTGTTT TTTTATTATC 120 

TGTTTCTTTA GTCATAACAA ACGCCTCCGT TATAAAACGC TATATTTAAT GATATGTGAT 180 






TTAATAAGAC GATTCAGCAA GTTTTAAAGT 

GTCCTATAAT ATCACTGACA TTGTCAAAAT 

5 - . 

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT 

GTGACGCACC GTGACGCATC ATTTTGATTG 

CTATAATTAA AAATTCACCA TTTGTTTGCT 

10 

GTTCAAATAA TGGACGACCA CTCAAACCGC 
CATCTCGTTG TCGCGTTGTG TTTGCTAAGA 

r5 GTAATAGTGC TTTTAAGCCA TCGAAATCCA 
. CTGTTACATC ATGTTGTTTT TTAAATGCTG 
CTTTATCATG GAAGTTTTGA AGATTTTCAG 

20 ATGAAACGTC GTGTTTAAAC GTATCAATAA 
AAGGTGTCAT TTTATTCACA CCAACATTGA 
GCAAATGACT TAGTGCTTTG TTCATACCAA 

25 

CGTCATCTTC TAATAATCTA AACATGCGTG 
TGATACCACC TAATTCTAAA GCACCGAATC 
AAGATTTGTC GAAACCAGCT GCTAAgCCAA 

30 

TTTGTGATAA CGTTGGATTC TTATAAGTAA 
GaAACTTTTG TaACGTTTTT AATGCATCGA 

35 TTTTGAATAA GAAAGGTTTA ATTAATTTGT 
GAGGCTTACT ATCCTCAACT TAATATATGT 
CCAT&CATAA TTTCCTAGTT AAAACTAAAA 

40 CGTTTTTAAG ATTAAATCAT CCTAATTAGG 
AAGGTGTTTG TATGAATGAA CAATGGTTAG 
TTTCACCAGT GAGTGGTGGT GATGTAAACG 
CATTTTTCTT ACTTGTCCAA CGTGGACGTA 
GTTTAAATGA ATTTGAACGT GCAGGTATCA 

60 TTAACGGTGA TGCGTATTTA GTGATGACGT 
GC CAATTAGG GCAACTCGTA GCTCAATTAC 
GCTTCTCATT ACCTTATGAA GGTGGCGATA 
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ATTATTTGAC 


TATGTTGGAT 


TAGGCATCTA 


. 300 


GATGATCTTT 


TAAGTAACGT 


GCGATGCCTT 


360 


CAATAACAAG 


TGATGAATAA 


ATTTGAATAA 


420 


CATCTTCAGT 


ACTGAATACG 


CCGCCTGTAG 


480 


GATAAgCATa 


CTTAATCAAT 


TTTAAATTAC 


540 


CTT CTTCGAC 


TTTATTAGCA 


GAAGTTAAAC 


600 


TGATACCGTC 


AAATGTCTCA 


GTAATCGCTG 


660 


TATCAGACGT 


TAGTTTTAAA 


TAAATTGGCA 


720 


TTAAAGCTTG 


GCATAACATT 


GAAAATTCAT 


780 


TATTTGGAGA 


ACTGATGTTG 


ACTGTGAAAA 


84 0 


p CTTTAT AT A 


ATCTTGATAA 


CGCGCTTCAT 


900 


TACCAACAGG 


TACTTGATAA 


GCATTTTTAC 


960 


TATTATTGAA 


GCCCATTCGA 


TTTATCAAGG . 


1020 


GTTGAGGGTT 


ACCCGGTTGA 


GGTTTAGGTG 


1080 


CAAGGTGTTC 


CAATGCTTTT 


GGTACTTCGC 


.1140. 


TTGGATTGTC 


GTACGTATXA 


CCTTGTATCG 


1200 


ATAGTTTATC 


GACGACTGGG 


AATAAAACCG 


1260 


TAGTTAGTCC 


gtgtgctttt 


• TCGGGTTCGA 


. 1320 


ACATGAGTAT 


GCTCCTATTT. 


CATTATATTT 


■ 1380 


GAAATATATT 


CTTTTAATAG 


ACTAGCATTT 


1440 


AGTTTTGAAA ATTGACGCAA gTTTGAATAA 


1500 


CAATATTATA 


GTATAAAGTA 


AGTAGATTGG 


1560 


AGCATTTACC 


TTTAAAAGAT 


ATTAAAGAGA 


1620 


AAGCATATCG 


AGTCGAAACA 


GATACGGATA 


1680 


AAGAATCATT 


TTATGCTGCA 


GAAATTGCAG 


1740 


CGGCACCTAG 


AGTAATTGCA 


AGTGGCGAGG 


1800 


ATTTAGAAGA 


AGGGGCTTCA 


GGGAGTCAAC 


1360 


ACAGTCAGCA 


ACAAGAAGAA 


GGCAAATTTG 


1920 


TTTCTTTTGA 


TAATCATTGG 


CAAGACGATT 


1980 
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GGCTATGGGA TGCCAACGAT ATCAAAGTAT ATGACAAAGT GCGACGTCAA ATTGTGGCGG 2100 

AATTAGAAAA GCATCAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160 

ATATGTTGTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 2220

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 2280

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 2340

TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 2400

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 2460

ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2520

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 2640

AGGAATGATA CATATTTATT GATACAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2700

GTATAATTGA ATGTTTGAAT ATCATATATT GATACAGTTT CTAATAATTT TAAAATAATT 2760

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATTTTTA AGACCAATAG 2820

TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 2940

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACTGCG AGCGATAATA TACAAAAAGT 3000

TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTGGTAAT GAAATTGGTA AAAATTAAAA 3060

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120

GAGCAAATGG ATGACTTATA TGATGAAAAG CGAGAGGTTA AAGCATTGAT AGATGAAAGT 3180

GATGCGCTTA ATCATTCGAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 3240

AGCAATATGG CTAGTCGTAT GGAACAGTTC CGGGATGAAT TTCATTTTGC GAAACGACGT 3300

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 3360

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 3420

GAAAATAAAT GGAAACAATA GGAAGCATTA TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 3480

TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3540

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3600

ATATAGATTC AGTTATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTTCAAGAGT 3660

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 3720

AATAAAGAAA TACTTTTTCT TTATTGGGGT GGGACGACGA AATAAATTTT GTAAAAATAT 3780
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ATGTCATTCA TAATCATTTG AACTAAACGT 
ACCAACTTCC GAAATGTAGA TGAATTCTCT 

5 

AAATTTCTCA AGGATAGGTC TATACTTTAT 
AACAATAACT TGAAATAGAT CATTGAGGGA 
TTATcGGTGC CGGTGCTGCA GGTATAGGTA 

10 

CAGATGTCAT TATTTTAGAA AAAGGAACAG 
CGACCCGTAC GATCACGCCA TCATTTACGT 

r5 CAATTTCCAT GGATACTTCA CCAGCATTTA 

CATATGCTGA ATATTTACAA GTGGTTGCCA 
CAGTTGTCAC AAATATATCT GTAGATGATG 

20 TATATCACGC GGATTATATC TTTGTCGCAA 

TTAAATATGG TATTCATTAT AGTGAAATTG 
ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT 

25 

GCTCTGACAT CGCACTTTAT ACTAGCACAA 
- GTGTTAGATT GTCACCTTAT ACACGTCAGC 
GCATCGAAAT GAATGTACAT TATACAGTTA 

30 

ATATCAGTTT TGATAGCGGA CAAAGTGTGC 
- : GCTTTGATGC AACAAAAAAT . CCAATCGTTC 

35 TTAAATTAAC AACACATGAT GAATCGACAC 

CAGTTGAAAA TGATAATGCC AAATTATGCT 
TACTTGCACA TCTTTTAACA CAGCGGGAAG 

40 ATTATCAAAA AAATCAAATG TATTTAGATG 

GTTAGAAGTG AAATATGATA TGAGAACTGG 
TTATTTGGTT ATTAGTCATG CGGATAAACT 

45 

ATTAATCATA AAACAGAAAT TAGATATTTC 
AGCGAGTGAA CATGTGATAG AACAATTGAC 
so ACCTAAAATA AGTGCGACAT TTTTAGCCTG 

AATCGGTATC GCTTATCAAT TTTCAGATTG 
AGAATATTTA ACTCAAACAA CATTGCTCAA 

55 



AG CAGCCTT A 


AATTTTAAAA 


AAAGACACAT 


3900 


ACAATAACGG 


AAGTTTTTCT 


TTTAATATTG 


3960 


AAATCGTAAT 


TATTAGGATT 


TATAATCAAA 


4 020 


GTGTTAATAT 


GCAACATCAT 


AAAGTGGCTA 


4080 


TGGCCATTAC 


CTTAAAAGAT 


TTCGGTATAA 


4140 


TAGGACATTC 


ATTTAAACAT 


TGGCCGAAAT 


4200 


CTAATGGATT • 


TGGCATGCCT 


GATATGAATG 


4260 


CATTTAATGA 


AGAACATATT 


TCCGGAGAAA 


4320 


ACCATTACGA 


GCTGAATATC 


TTTGAAAATA. 


4380 


CATATTATAC 


GATTGCAACG 


ACAACAGAGA 


4440 


CAGGTGATTA TAATTTCCCT AAAAAgCCAT 


4500 


AAGACTTTGA 


TAACTTTAAT 


AAGGGGCaAT 


4560 


TTGATGCTGC 


ATATCAACTT 


GCAAAAAATG 


4620 


CCGGl ITAAA 


TGATC IjCiA X 




4680 


GACTAGGTAA 


TGTCATTAAG 


CAAGGTGCTC 


4740 


AAGATATTGA 


TTTTAACAAT 


GGACAGTATC 


4800 


TTACACCTCA 


TG AAC CAAT A 


CTAGCAACTG 


4860 


AACAATTATT 


TGTGACAACA 


AATCAAGATA . 


4920 


GTTATGCGAA 


TATTTTTATG 


ATTGGTGCAA 


4980 


ATATCTATAA 


ATTTAGAGCG 


CGATTTGCAG 


5040 


GcTTACCAGC 


TAAACAAGAT 


GTCATTGAAA 


5100 


ATTATTCATG 


TTGTGAAGTG 


TCATGCACAT 


5160 


GCATTATACG 


CCCATACCTA 


ATGAAC CTCA 


5220 


TACCGCAACA 


GAAAAAGCGA 


AATTAAGATT 


5280 


ATTGGCAGAA 


AGTGTAGTTT 


CTTcGCCTAT 


5340 


ACT ATTT CAA 


CATGAGCGAC 


GACATTTAAG 


5400 


GTTGTTGATA 


TTTTTAATGT 


TTGCATTGCC 


5460 


GTTTCAAAAT 


CAGTATGTGT 


CAGCATGGAT 


5520 


TCACGATATA 


TTACAGCATA 


TATTATTTGG 


5580 






EP 0 786 519 A2 






ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 5700 

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 5760 

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 5820

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGTAGT TCTTGTAGTT ATCAAATAGG 5880

TGCGACATTA TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 5940

ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 6000

CTTAGCGTTC CGCTACCTTA TGATAGGCAA TTACATATGC CAAATATACG TCAAATGTTG 6060

CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CGCTACCTAT TTTTATCACA 6120

ATCTGTCTTA TTGTTAGTAT TTTATCACTA AOGCCAATTT TGAATGTTTT ATCACAAATA 6180

TTTACACCTA TATTATCGTT ATTAGGCATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 6240

TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 6300

GGAATGACAG CAACACAGTT ACTACTACTT GTGTTTTTTA GTTCAACATT TACAGCGTGC 6360

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 6420

GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGGATCAT TGTTAAAATA 6480

GTAATGGTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 6540

AGGAGATCTA TCTTGGAATA TGCTATTCAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6600

AATAGTTATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA ATCATTTGCT 6660

CGCTGAATGA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 6720

GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA CAGAGGGGTG 6780

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 6840

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6900

TGCTTGTTTA ACGCCTATTT TAGGTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 6960

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 7020

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 7080

AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 7140 

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 7200 

CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260 

ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGGCTT CCGTCTCATA TTTATAGGTT 7320 

GAAAGGGTTT ATGACATTTG AAGACACCGC ACATACGTAT CTCATTCAAT TTACACAAGG 7380 

55 
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CGGAAAGGGT ATTTCAAAAG AAGACTATCA 
ATGTTTGGAA CAGTAGTGTT TTCAGTGGAA 7500

GAGAATGGTT AACATGCCTT CATGTATAAT AACGAGTTGA TTTGAACGTT TAAGCGTAAA 7560

TAAAAATAAG CTTGGTCAGC CATCAAATAT AATTTGAAAA CTGTCCAAGC TGTTTTATTA 7620

GAGAACAATC AATTAACCCC ACATATTTAA TAATACATCA GCAAAGCCTT CAGGTTTTTG 7680

AATATAACCT AAGTGACCGC CTGGAATATC TACAATAGGT ATGCCAGTTT CTTTATTTAT 7740

ATAAAAGTTA ACATCTTGTG GGAAGGAGCC TCTAGAATCT GTCCCATTTA GTAGGGTGAT 7800

TTTATCGCTG TATTTTGTGA AATCATCCAA AGTAATATCT GAATGCGTAT ATTGTCTAAT 7860

TTCAAATTCT GACCAGAACA TCGTACGTTT GTACTGTTCT ATACGTCCTT CTTCAGTATC 7920

AGCAGGTTGA GACATCATTT TTGCATCAAT TGGTGCGATA TTTAATGTTT CGCCAAATGT 7980

TTTCATGCCT TTTTCTAAGC CTTCTGTTAA AATTTGATGC ACAATGTCAT CATTTTTATC 8040

TTTCCAATAA GTACTGTCTG GTAAAAATGT ATTAATTGGT GGTTCGTGAA ATGCAATCTT 8100

TTTAACGACT TCAGGGTAAT CTTTTAACAC ATGCATCGCA ACGATTGAAC CTGAACTTGA 8160

ACCTAATATA TAGACAGGTT CATCACTTAA TGACTTTGCA AGTTCGGCAA TGTCCTGTGC 8220

GTCGCGTTTG ACACGATAAT CACTGTCAGG GTTTGAAGCG GAATCAGGGA GTGGTTCAGT 8280

TAAGTCGCTT TCTCCATAAT CACGACGATC AACGGCTACA ACAGTAAAAT GGTCTTTTAA 8340

CTGTTCTGCA AGAGGCAGAA AAATGTCTCC GGTACCGTTT GCACCAGGAA TAAAGATGAG 8400

CACGGGTCCT TGTCCGACTT GGTGGTATCG TAATTTAGCG CCTTGTAATT CTAAAGTTTC 8460

CATATTCAAT GACCTCCATT TGTTAATTGT TAGGTGATAA ACCTAATAAX TTAGCACCAT 8520

TTGTATAACT TATTTTCTCT TTTTCTTCAT CTGTTAAACC CAGTTCATCT AAAAATACAC 8580

CTAATTTTTC AGGCTCAATA TATGGATAAT CAGCAGCATA AAGAATTCTA TCAATACCTA 8640

CTTCTTTCTT GACTAAATCA AACTGTGGCT TCGTTAACAT GCCACTCGGT GTGATATAAA 8700

AATTATTTTT AAAGTAATAG CTTACAGGGT GGTTCAAATG TTCAGCGAAT AAAGCTTCAT 8760

CCATACGTTC TAAGAAGAAT GGGATAAACT CACCCCAATG TCCAATAATC ATATTTAACT 8820

TTGGATAACG ATCAAAAATA CCAGATAATA CTAGATGTAT TGTATGAATG CCGACATCAA 8880

TGTGCCAACC ATAACCAAAA CAAGCAAATG TTGCCGCAGT TACTTCAGGA TAATTTCCTT 8940

TATAGTATGA TTGATAAATG TCACTGTTAA CTGGCGCGGG ATGTAGATAA ATCGGTACGT 9000

CTAAATTTTC AGCTGTTTTG AAAATAATGT CATATTTGTC TTGATCAAGA AAACCATCTT 9060

GTGCACGTCC CATAATGAGC GCACCTTTGA ATCCTAAATC ATTGATGCAA CGTTCGAATT 9120

CTCGCGCTGC GGCTTCAGGC TCATTGATAG GTAAAGTTGC AAAGCCTACA AAGCGATTGG 9180
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TCTGACCAAC CAAATTTGAA GGAGAACCAT TTCCATAAGA TAAGACTTGA ATTTGAACGT 9300 

CTTGATTATT CATAAATTGG ATACGTTCAT CATGATGTGA TAATTCGTCG GCATTTGTAA 9360 

AACCTGTCTT TTTTTcAAGG CCTTCTAACA TTACTTTCAT CGGTACACCT TTAGGATCTG 9420 

CTGATATCGC ATTCATCGTT TCTTTTTGAA TATCTTCAAT GACATAATGT TCTTCAAACG 9480 

TAATACTTTT CATTTACTTC GCCTCCATAT TGTATTGCAT GTTTATTGCA TCTATTGCAG 9540 

AAGCATTTTT TATATACCTC TAATTTCAAT GTTTGTAACA TAAAATTGAT CTACCAAGGC 9600 

ATCTCTCCAT CGCCATTAAT AAATGTACCT GTTGGGCCAT CTGCACCAAT CGTTGCTAAT 9660

TGAATGATTG GCTTGATTCC TTCAGAAACG TGTTTGGAAT TATTACTAAA ATCACCAACT 9720

AAATCAGTA7 TTGTAGCGCC TGGATCAGCA GCATTGATTT GCATGTTAGG TAATCCTTTA 9780

GCGTATTGTA GCGTTAGCAT TGTTACTGCC GATTTAGACG AACAATAAGC TAATGAATTC 9840

ACTTTAGATT CAGCTGTTTC GGGGTTTGTA ACCATTCCAA ATGAACCTAA ACCACTTGAT 9900

ACGTTGACGA CAACAGGTTG TTCAGATTTT TCTAAGAGAG GGACGAATGT ATTCATCATT 9960

CGTACGATAC CGAATACATT CGTTTGATAT ACTTCTTCAA CGTCACGAGG TGTCAATTTG 10020

GAAGGTGCTG AAAATTGACC AGATATACCT GCATTGTTAA TGAGGATATC AAGACGGCCT 10080

TCTTTTTCAG CAATCATGTT ATAAGCATTT TTGACTGAGT AGTCACTTGT AACATCTAAT 10140

TGTACATAAT GAACACCTAA TTTTTGTGAT GCTTGTTGTC CTCTTACATC ATTCCGAGAA 10200

CCTATATAAA CTTTGTAACC CAATGCTTTA AGTGCCTCTG CACTTGCATA GCCTAACCCT 10260

TTATTGCGTC CTGTGATTAA CACAATTTTA GTCATTACGT CCCACCTCAT CTAAATAAAT 10320

GTTTAATAAA TAATTTCTGT ACGCTTCAAT TGAAATATGG CGATGCTCTA TTTGGAAGGC 10380

AAATACACTA GTTGATAATG ATTGCAACAG CATATCTGTT TTGAAtTCGT GTAAGTGTCG 10440

TCATCGCTTT TAAATAAGTC ATAATAAAAA TCAAATAATT CTTGATAAAA TGCGCTTTGG 10500

TAAAAACGTA ATTTATTGTT GCCTGCTTCA ATACATTGCA GTAGTGCCTT ATTATCGATT 10560 

TTAAATTGTA AAAGATAATC TAACGACACT TGCATAACCT CATAATTAGA ATGATAGTCA 10620 

TCTTTAATTT GCTTAAAATG AGTGATAAAA ATATCAAGGT CTCTTTGTAT GACGTAGTAG 10680 

CATAAATCGC TTTTATCTTT GAAATGTCGA TACAATGTCC CCATACCGAT ACCTAGTTCT 10740 

TTAGCAATAC GATTCATACT AATGTTTTCA ACGCCTTCTT CATCAAAAAG TTTGTGCGCT 10800 

ATTTCTTCAA TTCGTTGCCT ATTCTCTTTT GCATCTTTTC GCATGATTAC ACCTACTTAA 10860 

AATTCtCTAA AATTGACAAA CGGATAACTC TCCGTTTATT ATAAAACGTG TTAAGAAAGT 10920 

TAGCAATGAA TTTGCAATAA CTATTAAATA TCATAAAAGA AAAGAGTGTT GATAATGTCT 10980
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ACCTTATCGG TTCAAATGAT TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 11100 

GTATTAGCTA TGACAGCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 11160 

TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACCA 11220 

GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 11271 



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6261 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:



WVivCUvj X X AGAACAAAAT AAAAACCGTA CAATTTTATC ATCTTAATGA TTATTGTACG 60

UJ*U*U"U"UV»» XXX TTTACATCAT ATCTGCATGT GCATAATCGA TATCGGTAAA TTTATTATAT 120

TRTTTCATAA AATGTAACTT AACTGTGCCT GTTGGACCGT TACGTTGCTT AGCAATGATA 180

ATTTCAATTT CACCGTTTTC ATCATTCGTT TGTGGCTCGA AACCACCATC ATCGTCATCA 240




TCTTCATCGC 


CGCCACGGTT 


ATAGTAAT CA 


TCACtivjIAiA 


i-UjAAX uUHnL 




300 


TCTTGCTCAA TCGAACCAGA TTCACGAATA TCACTCATCA TTGGACGTTT ATCTTGTCGT 360

TGTTCAACAC CACGAGATAA CTGACTTAAT GCGATAACTG GACATTTTAA TTCACGGGCT 420

AATGCTTTTA ATGTACGAGA GATTTCAGAA ACTTCCTGTT GTCTGTTATC GGACGCACGT 480

GAACCACTAC CTTGAATCAA CTGTAAGTAG TCAATCACAA TCATGTCTAA GCCATGTTCT 540

TGCTTTAATC GACGACATTT AGAACGTAAA TCATTAATTC GAATACCCGG TGTATCATCA 600

ATAAAAATCT TCGTACGTGA TAATTTACCT ACCGCTATAG TAAAACGACT CCAATCTTCC 660

TCAGTCATAG TACCCGTTCT TAAGCGGTTT GAGTCAACAT TTCCAGAACT ACAAATCATA 720

CGTGTGGCTA ACTGATCAGC ACCCATCTCT AGCGAGAAAA TACCAACTGT ATACATATCT 780

TCATGCGTTG CAACTTTTTG TGCAATATTA AGTGCGAACG CAGTCTTACC TACAGATGGA 840

CGCGCTGCAA GGATAATTAA ATCATTTCGG TTGAACCCTG CTGTCATTTG GTCTAAATCT 900

CGATATCCTG TAGGTATACC TGGTGTTTGA CCACTATTTT GATCAAGCTC TTCAGCTGTT 960

TCATACACTT GTCCTAAGAC GTCTCGAATG TCTTTAAAGC CATCGCTTTC ACGAGAAGAT 1020

GATAGCTCTA AAATTCGACG TTCTGCATCA CTTAAAATCG CATCTAGTTC AAGTTCATCA 1080

TTATATCCAT CATTGGCAAT ACTATCTGCA GTTTGAATCA ATCTACGTTT TAATGCATGC 1140

TCTGCAAGAT ATTGCGGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 1260

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCATTTA AGTGCATCAT TGCACGGAAA 1320 

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 1380

ATCAATTCTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 1440

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 1500

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 1560

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 1620

AATTTTAATA TCATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTGTAC TTACTGACCC 1680

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 1740

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800

CTGTAACTCT AATTGTTTAA GGTTAGCTGG TGTTGCTTCT ACAGCATAAT TCTTTTTCAA 1860

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTAGG 1920

TTTACCTTTA ACATCTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2040

GTTGCGGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100

ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCAACAACA 2160

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2220

GGATGATAAA TTTTATCGTC TGAACCATGC GcAATGGCTA TGCCATTATC TTCAACTTTT 2280

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 2400

GATGCTGTTC GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520

TCAGCTGTCG AACTTGCGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2580

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 2640

TCTAAAACCA GTTCCGGTTT ATGCGTATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760

TCGTTCATCA CGCGTCGTAA TGTTGGATCA ATGTCAGTCT CATTTAATAC GATGTATGCT 2820

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2880

TCAGGACGTT TATGTCCCAT GATAATGACT TTGTCACCCT GTGCAAGGAT ATCTTTTAAC 2940

CCATAGAAAC GCACATTACC ATTAATACTT TTAATTGCAA CTTGGTCGCC ACCGCGTCCT 3060

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACTTTT TTCACGTAAT 3180

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 3240

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3300

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTGC 3360

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 3420

AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AGAGGCAGTG ATCATTCTCA 3480

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3540

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3600

ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAACAATG 3660

ATACCAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720

CCATCTACAT AACTATCCAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3780

ATCATCACGA CAAGAACGAT AGATGCAATT AGTGCTATAA GACTATTAAA GATAAACCAT 3840

ACACCCATTA AAACAATTGC TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3900

TTAGTGGACT GCCGATTCAT TATTCCACCT CTATTCACTT TTTAGAATTA TTTTTCATGA 3960

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4020

GTGTCAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4080

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 4140

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200

TGATAACAAT AATGTATATC CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTGGCT 4260

TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4320

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380

TAAACCCTTC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 4440

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4500

TTTGTAGTAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4560

ATATTCTTTC TTTAGACGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4620

AGACTAATAT GATGGCACTT AAAACGAAAG TATTACCTAA AACAGTTGTT ATAATTACTG 4680

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740

CAAATACCAA CGCAATCGTT GCAATTATTG TTGCTTTAGG TTGTATTTTT GAAAACACAT 4860

AAGCCACTCC CATATTTTTA ACTATAGCTA TTATTTTAAC CTCTTTAATG AAAATTAACA 4920

ATTTATAGAT TGTATGCTTC TATTTCATTT AATTGAATAA TAACTTTCAT GTTTTATAAG 4980

TAATTAACAT ACTCATTTGA ATCGCTTTTG TGTGCTTTCA TTTTCAACAT GATTATTTAA 5040

TCCCACTACA TAGCAATCAA GCTTGATTTA GATTTACAAT ACATTTCCAC TCTCATGTAC 5100

TCTAGATGTT TTTGAATATG ATAACTGTGA TTTAGTGGCT TCATTCTTTG AAAATATATA 5160

TTATTACTTA CGCTTAAAAT GCTTTAAATT TAAGAAATGA TATAAGTTAG GTGCCCAGGT 5220

ACTAAAGTTT AGTAGGaATC CATCATGCCC AACATTATCA GGGACGAAGA AATGACGATG 5280

ATATTTAAAA CGTTCACCTA ATGCACGAAC TTGATCATCC GGATATAGCA AATCATCTAT 5340

GAACCCCATC GTTAACACTT TTGTTTCTAA ATTTTTAAAA ACATGCGTTA CGTCTGTGCG 5400

ACCTCGGTCA ATGTTGTGAC TATCCAATAC AfCTAGCAGT GTCAGATAAC AATTCAAATC 5460

AAAATGTTCT TTAAATTTAT TACCTTGATG TTGTTGGTAT GCGACTACTT CATCOGGCGT 5520

AAAACGTTCA TCATAACTTT TTGATGATCG ATATGTCAAA AAACCTAATT GGCGTGCAAT 5580

ACTTAGACCT TCCTTACCAC CAAGATGAAT GGCTTGCCTT GCAATTTCAT TGAAAGCTCT 5640

ACTATAAGAT GATGTTCGAC TTGTTGCAGC AAGGATAATG GCTTTATCTA CTTCAAACTG 5700

TTGATTGTAG AGTAGTTCCA TTGCTTGCAT ACCTCCAAGA CTTCCGCCTA TTAAAATATT 5760

AATCTTATCA TAACCAAGGG CTTGTATACC TCGTTCATTC GCTCTGACTA TATCTCTTAA 5820

TGTTAATTTT TTAGGAAAAT GAGGGfCGTT TAAAGGTGAA CTTGAACCGA AAGGACTACC 5880

AATAACATCA AATGTTAAAA ATTGATAATC GTGAATGGGT ATATATCCCC CATCAATAAT 5940

TTCTCGCCAC CAACCCGGAT AATCATCTGT TCCATATGTT AAATGATTGC CAGTTAATGC 6000

ATGACAAACT ACAACTAATG GTTGTCCATG ATAACCGACA TGCTCATATC TCAAACGCAA 6060


GTnATCTATG 


ACTTCCCCAG 


ATTCTGTAAT 


AAATTCCCCT 


AAATTTAAAG 




612 0 


GTAATTTGTC ATTGTTCTTT CCTCCTTAAA CAAAAAAACT TCTCACCCTA TTGAAAAGTA 6180

AGAAGTCTTT ATACTTATCA TTCGAGTAAC TCGTTGGTTT TAGCACCGTG CTATAAAGTC 6240

GGTTGCTGAA GTATCACAGG G 6261


(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1222 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



253 



EP0 786 519 A2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

ATGCGATTAA CTCTGGAAAT ATCTTTTCCA TATTTACGTn TTAAATTATT CAGCAAATTC 60

ATACGAGaTT CATACTCGTT yAACACTTGT TCGTCGAATT CTGTATTAGC CATTTCATCA 120

TATAACTCAT GTTTTGCATC TTCTAAAATG TAGTAAAATT GATCAATATC TTCTTTTAAT 180

TTGTCATATT TGTTTGGAAC TATATCGTTT ATTGTTAACA AATGGTTGCT TAGTTCATAT 240

AAACGATCAG TGATAGCATT TTCATCCGTT AATGTCATAT ATGCGTTATT AAGCGCTAAG 300

CTTAATTTTT CAGAGTTTTG AATGCGTTTA ATATCTATTT CAAGTTGCTC TATTTCGCCT 360

TCTTTTAGAT GTGCTTCAGA CAATTCTTCT AATTGGAATT TCATTAAATC TAAACGCTGT 420

AGCAATGCTT GGTCTGCTGA TTCTAAATCT TCTAACTCTT GCTTTTTGGC TTTATAATTT 480

TGAAAAGTTT GGTGATATTT ATCCAACAAA TCTTGATAAC GTGATTCTGC GTAATTATCC 540

AATAATGTTA AATGGTATTT TTGTTTCAAC AAAGACTGCG TTTCATGTTG GCCATGAATA 600

TCTAATAATT CTTGCATAAC TTTTCGTAAA TCTTGTAAAG TAACTGTTTG ATTATTAATT 660

TTACAAAGAC TTTTACCAGA GCTGAAAATT TCCCGTTTAA CTAATAAAAA ATCTTCATCT 720

ACATCAATAT CCATATTTTT CAATATATGT ATAGCATCTT TACTCTCGTC AATATCAAAT 780

ATACCTTCGA TGACAGCCTT TTTTTCACCA TGTCTTACAA AATCAGATGA AGCTCTCATT 840

CCAATTAATT GTCCAATTGC ATCTATAATA ATTGACTTAC CTGAACCCGT TTCACCACTT 900


AAAACAGTTA 


TV TV TV mp7\ /-» * 

AAC CATCAGA 


AAA1 IVjAAI 1 


" r P/" , T" , 7\ 7A*i w T^f ~ I ~ 1 ' 
l^iAAl 11.1 1 


CAATAATAGC 


AAATTGCTTG 


960 


ATTGATAAGG TTTGTAACAT AAACTCATCG CATCCTTATA ACAAATTGAA AATTCTTGAC 1020

TTGATTTCAT CACTTGCCTC TTTGCTTCGA CAAATAATTA AACAAGTATC ATCACCACAA 1080

ATTGTGCCTA GTACTTCTTC CCAATTGATT TGGTCTAATA TAGCTCCAAT AGATTGTGCA 1140

TTACeAGGTA TGTTTTTAGA ACAAGTAAAT TATCAGTACC ATCTATATTA ACAAAGGAAT 1200

CCATTAAATA ACGTCCCAAT TT 1222



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1021 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTTGTTATTA TTACnTnAAA TAATTGCATT ACTTTTTACT GATGGTACAA CTTTCCATCC 60

TTCTTTTGGC ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 240 

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300 

GATACTATGC AATTGCTCTG CTAACTCAGG TGTTACAGCT CGGTTTAATG CAACAATTCC 360 

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 420

ACCGATACCA ACACCACATG GATTCATGTG TTTAACCGCA ACTGTAGCAG GTGTATCAAA 480

CTTTTTAACT AAAGCTAGTG TAGCATCTGC ATCTTTAATA TTGTTATAGC TTAATTGTTT 540

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTGCTTAGCA TTCGAAGTTC TCACAAAATA 600

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACG 660

TACAATCGCT TGATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720

ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780

ATGTACAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 840

ACCAATATCA ATATTTTCAA TTGCTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960

TTCATTTAAA TGCTGCGGTT TATTTCGATC AGCTAAAATG CCACCATGAA CAGCCGGATG 1020

T 1021

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3759 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCATTCACTC CTAAATTGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60 

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120 

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180 

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240 

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300 

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360 

CTATCATTAT ATTGATTATC TTGACGATTG AAATCATAAA GTCTATATGT AATGTCTGAC 420

ATAAAAtAGa ATTCyCCAGG kTTTACtTTA AtatATCyAA gTAtCGaCtC tATCGTTCCG 540

TGTTGAACAT GATTCGCAAC TTCTTCTCTA GACTCTGCTA ATGTCCCtAT AACTATTTCT 600

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 780

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 840

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 900

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG ATGTTGTTGT 1020

TCCATTATAT TTTGATTTTG TTCTCATTTA CATCGTATTA TTAACTTCCA CATTTCAAAT 1080

TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT AAAAGnGCTT 1140

CACTTAAATT TACAGTACTT TAACATTTTC AAAAATTTAT AGCATAGAGA TTATATCTCT 1200

CTTACATTTG TACATATTTC CCTTTAAATT TACTCGCCCA TTATACCAAT TAATAaACAA 1260

CTTTAATAGT TGTGCCATAC ATTGTTCAAA TTCTTTGTAA AACGCATAGA CAATACGTAC 1320

TTATTCATAC TTATAATTCA TCATTTTCAA AAAATAACGA GTTACX3AAAA AGTAACCCGC 1380

TTCAAATCAT ATTTACTATC CTTATTAATC CGTTTCATTT TCAAATTGAG TTAAAGCATC 1440

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500

GACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 1560

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGGATGTAA 1620

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740

CCATGACATA GTGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 1860

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920

CAACAAGAAT TGCTTTCATA CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 2040

CGCCATTAGC ACCTATAACC AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100

ATGCCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATTGTCATA 2220

tCaAGGCATT AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 2340

AAGTGTTGGT CACATTtCAC ATCAATTGAA TTTATCTCAA TCAAATGTCT CGCACCAATT 2400 

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460

TTCATTAGAT GACATCCACG TAGCAACTAT GTTAAAGCAA GCCATACATC ACGCGAATCA 2520

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCATCATGAC GATATGCATA 2580

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 2640

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCtAACAG CTTGGCATTA CTATCTGACG 2700

GTATCCATAT GTTTAGCGAC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTTCGAA GTACTCGCAG 2820

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAGCGATTA 2880

AACGTTTCTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAAT GTTAATCATT AGTATTATCG 2940

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGGCGAC ACTTCACACA 3000

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGCG 3060

CCATTACTGC AGCTAkTTTA ATTTGGGCAT TTGGATGGAC AATCGCCGAT CCTATCGCAA 3120

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180

ACATTTTAAT GGaAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 3240

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3300

TGAATGCATT GAGTTGTCAT GTTGTTGTAG ACCATACATT GACAATGAAA GAATGTGAAT 3360

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 3480

ATTCACATAA CCATCATGCT CATCATCACG CGCATGTACA TTAATAATTT TAACCTACTG 3540

CCATTGCAfC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATTATCAn CCAGAGGCAT 120

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 240

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 300

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTc AGATGTGCAA 360

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGTTTTT CGGCGATGAG 480

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 540

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACACGTGAAG AAAAGTTGAA AGTTGCGATT 600

GAACGTATTG AAAAAGAATT GGAAGAACGA TTGAAAGAAT TACGAGATGA GAATAAATTA 660

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720

GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 840

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900

TTGGTGGATC ATGGGTTTAG ATTACCGAGT GCATTAGATA ACCGTCCACT TAAATTTGAA 960

GAATTTGAAG mAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 1080

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG TCATATTAGA 1380

TGCAGATAAA GAAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAGAgC 1440

TGCGCGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATTCGATGAA 1500

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560

TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680

TTTCGAGAAA GCTACAGAAT TAAGAGATAT GTTATTTGAA TTAAAAGCAG AAGGGTGACA 1800

AGTAAATGAA AGAACCATCC ATAGTAGTAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100

CTGTAGCAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACGT GTCATCGCTC 2280

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT GAAACACGAC 2460

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCC TTTTGGTGCT TGTCCGACAT 2640

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTCCC GACAAAGATA 2700

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820

TAACAGAACG TCAACGTGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880

CATTTACAGA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940

CTAAIATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GCGATTGAGT CGTGAAGCkT 3060

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180


45 


TATTGAAAGA 


AATTATTTCC 


vwV^ X X J. 


TTTTAAATAA 


TGTGGGACTT 


onnlnl 1 Lf\J\ 


i^4U 




CGTTAAACAG AGCTTCAGGT ACACTTTCAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

ATACTTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480

AGGTAATGAA AGATAAAAAA TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600

AAGTACCTGA ATATCGCAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660

GCAACAATCT TAAAGGGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960

CTAAAATTCG AGGATATCAA AAAGGGCGTT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA CAATTTTTTG 4200

AAAATATTCC TAAGATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCGACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440






J. ul iul<nnl -L 


ATTGAACATA 


ACCTAGATGT TATCAAAACA 


GCAGACTATA 


4500 


TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAG TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800

GGGACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCATGCA TAAGaAATAC 4920

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100

CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160

AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220

TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280

ACATTAAAGT TAGATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460 

ATACAACTAT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACCAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACGCCTG CAATTATTGT GACACGTGGA 5580

TTGCAGCCAC CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700

GCAAAGACGA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120

GCGGTAATTA TTGAGGTCGC TGCAATGAAC TATCGATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAATGAA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360

TTgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480

CATATTACGC GGAAAATCCA AGTGAAATTA TTAAAATATG GCATGGTGGA ATAGCAATAC 6540

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660

GCTGGGGTAA CTTTATGAAT CACGAGGCAC ATGGTGGATC GGTGTCACGC GCTTTTTTAG 6720

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780

ATCCAACATT CTTATATGAA TGCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840

TTCGTAAACA TTTAAAATTA GGAGAAACAT TCTTTTTATA TTTAACTTGG TATTCAATTG 6900

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020

GGATTAAGTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080

TTATGGCGTG TATACCGTCT TGTTAAATTT TCGAAAGTTT TTAAGAATGT AATTATCATT 7200




GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680

ATATAAACGA GCTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920

TATTGTTAGC GACGACCAAG ATTATAGAGA TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040

ATATCTTITT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAG ATCAACCACA 8100

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340

AGGTATTGTA TTAAGTTATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460

TTATTTAGGT GAAGAAGACG AAAGTCATTA CTACTGGGAT AAATTGAAGC AAATTTCTAA 8520

AGTGGAAATT GGACATGCGC CTTGGGTAAT TGAAAATAGC AAAGAAGTTT TTGACCAACA 8580

TATTTTGCCA TTACTTCAAA GTGATGACAG TCATTATCGT TTATATGGTA TTrrrrrATT 8640

GGATCAATTA AATGGTAAAG AAATTGTGAT GACGGAAAGT ATTTGGCAGG TTTTGGAAAA 8700

TCTAAATAAT TATGAGAAAT TGTATTTAAC GTATTTAGTT CAAGGTTTAA CGCTCAATAA 8760

ATTAGACTTC ATTCATCGCG GCTTATTAAC GCTTTACCAT AATGAATTAT TTGTAAGTGA 8820

AAATGATGTA ATGGTTGCAT GGATTAATCA AGGTGAACTC ATAATTGCTG AAAAAGTAGA 8880



TCGAAACGTT ACAAAGAAGC AAATTACAAC ATGGTTAGGC ATAACACAAT ATAAACTGAA 9000 

CAAAATGATT GAATTTCTCT TGAGCATATA GATTTATGAA AAGTTAGATT TATTATATAA 9060 

TGCGCATAAT GATTAATAAT GAGGAGGCGT TAATAAAATG ACTGAAATAG ATTTTGATAT 9120 

AGCAATTATC GGTGCAGGTC CAGCTGGTAT GACTGCTGCA GTATAGGCAT CACGTGCTAA 9180 

TTTAAAAACA GTTATGATTG AAAGAGGTAT TCCAGGCGGT CAAATGGCTA ATACAGAAGA 9240

AGTAGAGAAC TTCCCTGGTT TCGAAATGAT TACAGGTCCA GATTTATCTA CAAAAATGTT 9300 

TGAACACGCT AAAAAGTTTG GTGCAGTTTA TCAATATGGA GATATTAAAT CTGTAGAAGA 9360 

TAAAGGCGAA TATAAAGTGA TTAACTTTGG TAATAAAGAA TTAACAGCGA AAGCGGTTAT 9420 

TATTGCTACA GGTGCAGAAT ACAAGAAAAT TGGTGTTCCG GGTGAACAAG AACTTGGTGG 9480

ACGCGGTGTA AGTTATTGTG CAGTATGTGA TGGTGCATTC TTTAAAAATA AACGCCTATT 9540 

CGTTATCGGT GGTGGTGATT CAGCAGTAGA AGAGGGAACA TTCTTAACTA AATTTGCTGA 9600 

CAAAGTAACA ATCGTTCACC GTCGTGATGA GTTACGTGCA CAGCGTATTT TACAAGATAG 9660

AGCATTCAAA AATGATAAAA TCGACTTTAT TTGGAGTCAT ACTTTGAAAT CAATTAATGA 9720 

AAAAGACGGC AAAGTGGGTT CTGTGACATT AAGGTCTACA AAAGATGGTT CAGAAGAAAC 9780 

ACACGAGGCT GATGGTGTAT TCATCTATAT TGGTATGAAA CCATTAACAG CGCCATTTAA 9840 

AGACTTAGGT ATTACAAATG ATGTTGGTTA TATTGTAACA AAAGATGATA TGACAACATC 9900 

AGTACCAGGT ATTTTTGCAG CAGGAGATGT TCGCGACAAA GGTTTACGCC AAATTGTCAC 9960 

TGCTACTGGC GATGGTAGTA TTGCAGCGCA AAGTGCAGCG GAATATATTG AACATTTAAA 10020 

CGATCAAGCT TAATTCGAAG TCGAATTAAG ATGTTGAGCT GTAAATTATT TGGATATTTA 10080 

TTTTAATAGT GTCATCACAG CGTTAAAATA ATGTCTTACT TTTAAATTAA AGCAAATTAT 10140

ATAG5AAACT AGAACTTAGT ACGTATCATT TGTGCGTTTC AATGAGTTCT AGTTTTTTTA 10200

TATGTTATAT TAAACTTATA ACTTTATGGG AGTGGGACAG AAATGATAAA GAGCCACTAA 10260 

TGATTTATTA TGTAGTGGTT CTTAAACATT AGCCACAGCT AATGTGTACT TAAAAATAGG 10320

AATACATGAG TAAAACTCAT GCATAAGAAA TACTAATTTC TATAGAAAAA GTATTACTTT 10380 

ATCGTTGTCC CACCCCAACT TGCACATTAT TGTAAG CTGA CTTTCCGCCA GCTTCTGTGT 10440 

TGGGGCCCCG CCAACTTGCA CATTATTGTA AGCTGACTTT TCGTCAgCTT CTGTGTTGGG 10500 

GCCCCGCCAA CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC 10560 

CGCCAACTTG CATTGTCTGT AGAAATTGGG AATCCAATTT CTCTATGTTG GGGCCCACAC 10620

CCCAACTCGC ATTGCCTGTA GAATTTCTTT TCGAAATTCT CTGTGTTGGG GCCCACACCC 10680

ACTCGCATTG CCTGTAGAAT TTCTTTTCGA AATTCTCTGT GTTGGGGCCC CTGACTAGAG 10800

TTGAAAAAAG CTTGTTGCAA GCGCATTTTC ATTCAGTCAA CTACTAGCAA TATAATATTA 10860

TAGACCCTAG GACATTGATT TATGTCCCAA GCTCCTTTTA AATGATGTAT ATTTTTAGAA 10920

ATTTAATCTA GACATAGTTG GAAATAAATA TAAAACATCG TTGCTTAATT TTGTCATAGA 10980

ACATTTAAAT TAACATCATG AAATTCGTTT TGGCGGTGAA AAAATAATGG ATAATAATGA 11040

AAAAGAAAAA AGTAAAAGTG AACTATTAGT TGTAACAGGT TTATCTGGCG CAGGTAAATC 11100

TTTGGTTATT CAATGTTTAG AAGACATGGG ATATTTTTGT GTAGATAATC TACCACCAGT 11160

GTTATTGCCT AAATTTGTAG AGTTGATGGA ACAAGGAAAT CCATCCTTAA GAAAAGTGGC 11220

AATTGCAATT GATTTAAGAG GTAAGGAACT ATTTAATTCA TTAGTTGCAG TAGTGGATAA 11280

AGTCAAAAGT GAAAGTGACG TCATCATTGA TGTTATGTTT TTAGAAGCAA GTACTGAAAA 11340

ATTAATTTCA AGATATAAGG AAACGCGTCG TGGACATCCT TTGATGGAAC AAGGTAAAAG 11400

ATCGTTAATC AATGCAATTA ATGATGAGCG AGAGCATTTG TCTCAAATTA GAAGTATAGC 11460

TAATTTTGTT ATAGATACTA CAAAGTTATC ACCTAAAGAA TTAAAAGAAC GCATTCGTCG 11520

ATACTATGAA GATGAAGAGT TTGAAACTTT TACAATTAAT GTCACAAGTT TCGGTTTTAA 11580

ACATGGGATT CAGATGGATG CAGATTTAGT ATTTGATGTA CGATTTTTAC CAAATCCATA 11640

TTATGTAGTA GATTTAAGAC CTTTAACAGG ATTAGATAAA GACGTTTATA ATTATGTTAT 11700

GAAATGGAAA GAGACGGAGA TTTTCTTTGA AAAATTAACT GATTTGTTAG ATTTTATGAT 11760

ACCCGGGTAT AAAAAAGAAG GGAAATCTCA ATTAGTAATT GGCATCGGTT GTACGGGTGG 11820

ACAACATCGA TCTGTAGGAT TAGCAGAACG ACTAGGTAAT TATCTAAATG AAGTATTTGA 11880

ATATAATGTT TATGTGCATC ATAGGGACGC ACATATTGAA AGTGGCGAGA AAAAATGAGX 11940

CAAATAAAAG TTGTACTTAT CGGTGGTGGC ACTGGCTTAT CAGTTATGGC TAGGGGATTA 12000

AGAGAATTCC CAATTGATAT TACGGCGATT GTAACAGTTG CTGATAATGG TGGGAGTACA 12060

GGGAAAATCa GAGATGAAAT GGATATACCA GCACCAGGAG ACATCAGAAA TGTGATTGCA 12120

GCTTTAAGTG ATTCTGAGTC AGTTTTAAGC CAACTTTTTC AGTATCGCTT TGAAGAAAAT 12180

CAAATTAGCG GTCACTCATT AGGTAATTTA TTAATCGCAG GTATGACTAA TATTACGAAT 12240

GATTTCGGAC ATGCCATTAA AGCATTAAGT AAAATTTTAA ATATTAAAGG TAGAGTCATT 12300

CCATCTACAA ATACAAGTGT GCAATTAAAT GCTGTTATGG AAGATGGAGA AATTGTTTTT 12360

GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420

GATGTGCAAC CAATGGAAGA AGCAATCGAT GCTTTAAGGG AAGCAGATTT AATCGTTCTT 12480

GCGTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 12600

GAAACAGATG GTTATAGCGT GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660

CCGTTTATTG ATTATGTCAT TTGTAGTAGA CAAACTTTCA ATGCTCAAGT TTTGAAAAAA 12720

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 12780

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12840

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAGAATTAAT TAGTACTATT 12900

CCTTTCGTAC CAAGTGATAA ACGTnAATAA TATAGAACGT AATCATATTA TGATATGATA 12960

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13020

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13080

AGACGT 13086 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1350 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 60

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 120

ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 180

GTTAGCTTCT CTTGTTTTGG GAATGAATCA TCTTCTTTAA TCCAAATCGC TAATTCGCCT 240

AATGSTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 300

CTCTCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 360

CTATTTAGTG AACTTTTTAA GGTTGTGCAC TCTTTTAATG TCTGCCAATT AGGTCAATTA 420

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 480

CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 540

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 600

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 660

TCTGCCACGT ATAATGTCTG CTGCTTTTTC AGCTAACATT AAAACAGGTG CGTGTATATT 720

GCCATTTGTC GTACGTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780

ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 1020

TGTTGATAAA TAATTAAAGC GGATACTTGG TTTTTCGAAT GGATCTTTAG ATTTGATTTT 1080

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 1140

CGCTGCCTTT TGACCATCAT ATCTTACAGC TATTGGTAAG AAATGGAACA TTAAGTTAGG 1200

ATAAtCAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 1260

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAtATCTAA 1320

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1376 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 240

TTACAATACC AACAAAGACA TTGCGATTAT TCATCiTTrC AAAAGCTGAA TTTATTTCCG 300

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 360

AATATCCTAA CGTrGCTGCA rGCGsGACTG CACCATCATT TTTCTTTGCC ATTCCTATAG 420

CTACACCAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 480

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATTCCGGCAA 540

TAATTGCTGC AGCCGGTAAA ATGGCAACTG GTAACATTAA CGAACGCCCT AAATTTTGGA 600

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 780

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TACGCTCGTT TCGTTTTAAA 840

AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT GACTTACTTA 960

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020

GTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 1080

TACTAGGTCC TGGATTTTCA CGATCTACAG GAAATGCATT TAAAGACGTT AAAAATTTAC 1140

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 1200

ATGCCATACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 1260

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 1320

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 1376

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7363 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TTGTCATACC AATATTTTGT AAAATATGGA ACACAAGTAA AGTGACGAAA CCAACGATAA 60

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 120

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 240

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 300

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CATTTGATAC AGTTGGaCAC 360

CTAAJAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 480

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TGCAGTAATA 540

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 600

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 780

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCTGATAC 900

TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCGTTT TATACTCACT AATGTTTATA 1080

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320

ATATTAAGGT AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620

AACCATACGA TGCAGTATTC CCATTATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680

AAGGGCTTTT TGAAGTTTTG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740

GTTCTATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980

TTGACCGTAA GCTTGTTATA GAACAAGGCG TTAACGCACG TGAAATTGAA GTAGCAGTTT 2040

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100

ACGATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520

AGAGATAAAT GGAGTCACAA TTGATTCACG AGCAATTTCT AAAAATATGT TATTTATAGC 2580

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGCATTAC AAGATGGTGC 2640

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700

AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGAtAATGAT ACTGAAATAT CAATATTGGA 2940

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000

TGCAGTTATA AGTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120

TGGCGATGAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240

TATTTCATTT ACGATTAATA ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300

TATGAAAAAT GCGACGATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAG GAGATGTTTT 3540

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900

TTGCCTTTTT CTTTTTATGT TAAATCTATA AATTTGAAAC TAAATCAAGG TTAATTCTAT 3960

GTAGACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080

ATACGTATTT TATAAAAaAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440

AATTGATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500

AGAATTGGCA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 4620

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4680

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4740

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAATGA TGAATATGGG 4800

ATTCATCGAT GATATGAGAT TTATTATGGA TAAAATTCCA GCAGTACAAC GTCAAACAAT 4860

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4920

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4980

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 5040

ACCTGAATTA GCAATCGTAT TCGGACGTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 5100

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 5220

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 5280

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 5340

CGCTGTAACG TTTGTTAATC CAATCGAAAT GGATTATATC AGACAAATTG AAGATGCAAA 5400

CGGTAGAAAA ATGAGTGCAy TcGTCCACCA CATCGTAAAG AAGTACTTCA AGCACGTGAA 5460

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5520

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGTTG ATTTAGTTGC TGCACTTTTA 5580

CAAGAGTTAG TAGAAGGAAA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 5640

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700

AATCCTAAAT TTGACAGTAA GAGTAAACGT TCAAAAGGAT ACTCAAGTAA GAAGAAAAGT 5760

ACAAAAAAAT TCGACCGTAA AGAGAAGAGC AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGTCTTGA 5880

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 5940

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 6060

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 6120

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180

TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6240

TAACCAATGG GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300

CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 6420

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 6480

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 6540

TGTGACTTAG CGATTCTCTC GATGACTCGT ATTCGTGCCC GAAGCTCATC AACATTAAAA 6600

GGTTTAGTCA TATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 6660

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 6720

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 6780

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGGTGTAG TTACATTGTA ATAATCTAAA 6840

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 6900

GATTGCATTA TACGTCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 6960

CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 7020

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 7080

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GOTCCTTCGT 7140

CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 7200

TGTCAGTAGA AGTGTGTTTA ACTGCATTTT CAATCAAATT GAAtAAAgCT TGTAAAATCA 7260

ACTTACTGTC AATGTGTATA AACtGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 7320

TTAAATGGCG ACGTTCTAAA ATACATATCG ATTTCTTATA CTA 7363

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10470 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:



TTAACAATCG ATAACCACAA TACTTCTATT GTAATTGTTT AACGATTTCn CGATTAAAAT 60

CATCTAAATC GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTGATC 120

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 180

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGGCAAATGT 240

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 300

ACGTAAATGA TCTGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 360

ATTTGCAGTA TCTCCAATCA CTACAGTATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 480

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTTAGTCAT 540

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600

AAACATACAA GATATAAAAA ATATTAAGCG ACTGTCGCGA TATCTAACCC TAACACATCT 660

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTGATCT AACAATCCAA 720

TCCATTCATC TATATCTTCA ACACGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 840

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900

TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 960

AATTAACTAA AATTAAGAGA TAATTTGTTA CGAGTGATAA TACGAaGkGG TaTCATACCG 1020

ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAAGCAAC 1080

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCATGAG 1140

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTGTT 1200

TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 1260

AACAACCACA CTCCTAAATT AATAGGTGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 1320

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACGTTTA 1380

GAATCTCTTC GGCAACTTTG CTATAGACAG TCTATGCTGT TACTAAATTA TACCACCACA 1440

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG TCATATGAAT 1500

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CGAGTCATTT 1560

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620

CAGfTGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 1680

CATATTGCGC TACGCCAGTT TGTTTGTGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 1740

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 1800

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 1920

GTAAGGAGGA GCCATCAGGC TCCAAGCATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 1980

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGGAATTA TCAGTACTGC 2040

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 2100

ACACATAATT TGTAAGTCAT CAACTAACCT ACAAATATAA TTATACTAAA CAAATGTTTA 2160

GTTATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAGaCATGAC 2340

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 2400

ACCATTCCAT GTTCTAATAG GCAAGTAATA ACGTTGCCCC TCCCATGTAT ATCCTACCCA 2460

AACATGACCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2580

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTCCATGC 2640

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC TCATTAGAGA CAGTGGCAAC 2700

CGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2760

CCGCTTGTCT TCTGGCAATA GACCGCGAGT TACTGGGTCA AAACCAGTGT GTAAAACCGA 2820

ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2880

TGCTGGTAAT CCCCATTTTT TCAACAATCT AGCGCATTCT TGGAAAGTTG CCTGTTCATT 2940

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CGTAATAATA 3000

TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3060

AAGTGTGTTG GCTGATACGT AACTATGCGC AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120

ATTTACTAAT CCGTTACGAT ATGGTTCAGC AGTCGGCCCT TTGCTCCCTG CGTCGTTGTG 3180

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3240

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3300

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG GACGGATAAA 3360

CCACATAGGG AAATCATAAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 3420

TTGTTCGATT CCGTGAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 3480

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3540

GAATACCACC ATGTCGCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600

TAATCCGTCG AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660

TCCAAACAAA ACTTTCCAAC CAGCATTGGC ATAATCAAAG CATTGAAATC CATACCATAA 3720

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 3780

TAATTTTGCT TGCATTGTCG CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 3840

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3900

ATTCCGGCGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3960

TTGGGTCAGT AATAACGCCA ATACCTGTAA GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080

TAGCTTGATT TAATTGAGTA GATAAATCTA ATCCGAATAA ATCCGTGACT TGCTTGATAA 4140

ATAGCAACAA TGCTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4200

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATTAGTC 4260

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4320

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4380

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 4440

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4500

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4560

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4620

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4680

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATATATAGTT ATATTCATTT TCTGTTCCTG 4740

TCCAAATTTT AACCGTCGGT TGAGATGCGC TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4800

GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4860

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4920

CAATCTGTTG CAATACACTT TCTGAAATAG AGTTGTTTTG TATTGCTTCT GCTAATTCTC 4980

TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCCGTAT 5040

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100

CTAACCCTTC AACATTTGCG ATATTGATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220

CGTTXTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 5280

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 5340

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 5400

TATCTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 5460

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 5520

CTTCTTCTTT CTCTAAAAAC AGCTTACAGC GAACATAACC AGCGTGTTTG ATAACCTTTT 5580

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 5640

TGAATATAGA GCCATCTTCC ATAAACAAAT GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 5700

GATCGATACG ACCTTGTTTG TCATTGATAC CTATTCTTAT AGATGCTGTA TTTTCATCTT 5760

CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5940

ACCACTCTAt AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6000

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6060

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTCCTTGC AAACGTCATT GCGTAGTTAG 6120

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180 

AGTTACTTGT TCCATATCCA GTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 6240 

TGTTGTGCTT TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAC 6300

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6360

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 6420

GTGCTCGTGT AGCGTTAscC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480

TTGTTTATCC AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 6540

TTTAGATGCC GAACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600

GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAGTTATAGA 6660

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATCCTTCGTA 6720

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 6780

ACGCCAAATG CTGTCGTCTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 6840

CATCATATCT ACAGCTACAA CCATTGCGTG AATCTCATTA AAAATAAATT CATTTTTACT 6900

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960

TACAGCTCTT CTAGGTGCCC AAATATTATG TCTATCAACA TAAAAGTGGG GATATTCTAC 7020

ATCCTGTTTG TATTTCTTCC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080

GTTTCTAATC ATTATTCCTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 7140

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 7200

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 7260

TTTAGGTGCC GGTGTAGTTT TGTCTGGATG ATATGGTGGT CTAACAAAAT ATTTAACCCC 7320

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 7380

ATTTGTATAC CAGTTTTGAT CTACGCCATA GCAATAGTCT TTTGTGGATG GTCCCACTAC 7440

AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 7500

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 7560

TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 7680

AGCGATATAT AACGCCCATT C3ACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 7740

AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 7800

ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATCCTAAA 7860

ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 7920

TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 7980

TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT 8040

CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCTTGCGC TCTAAGACTA TCTTCTATTC 8100

TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 8160

GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 8220

ACAAGCATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 8280

AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 8340

TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA 8400

TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG 8460

TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 8520

TTTGCATACT CGCAAGCATT CCGCGAAGTT CCTCATCACT TAAATCTGAC GCACTTTGTT 8580

GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 8640

TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 8700

CATCTAGATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 8760

TATCGTTTAC TGTGATTTTC ATTATTTCCA CCCCATAATT TTAGTTATAG TAACTTTGTT 8820

GGCAJTCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 8880

TAAAGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 8940

TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 9000

TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 9060

CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 9120

TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 9180

TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 9240

AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCAGTCT 9300

TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 9360

GAATTTATCA TCTACATACT GCTTAGCTTG ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480

AAATTGCTTA GTTAAGTTTC CATCATTCTT TTTATAAAAC GGGTACCATG TGCCGTAGAT 9540

TTTGTATTTT GTGTACTCAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAGCAGTATT 9600

ATTATCAACA AGATAAACAA CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 9660

GAACGAAGAA CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 9720

CTTTTCTGGC GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 9780

TAAAACGCTA TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 9840

TTTAGACTTT TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900

TTGTTCAACT TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960

TTCAGCTGTA ACAGCTTGTG TTGCACTAGT TTGCGtCGGA GTAATAGCTT GTATAGCTTC 10020

GTTTTGCTTG ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTGCTAT 10080

TAACGTTTGT GTATCAGCCA TATTTTGCTT TAATTGGTTA AAATCTTTAC CGACAGCTTC 10140

GATAGTATCT TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200

TAAATCATTT TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260

TTGTGTAAAG AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 10320

CACATACTGC AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 10380

GCCTCTATCT ACGTTATAAT CATCGGTTTT TAAnACGATA GATGTTTTAA CATGTTCAGA 10440

ACTTATAGAT AAGGGTCTGT TATnCTTAGT 10470

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 60 

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120

AAGATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180

AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 240

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 300

TCAAATTGTA ACAACTAATC CTATTGCAGG TACGATTCAA CGTGGTGAGA CGACACAAAT 420

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 480

GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAATCGGTA CCTCAAAAAT 540

TACTAAATTA ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 600

AGGTAAAATA AATCAAAATT TATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 660

10 

TACCGTTTC A GGTGCACCAA AATTACGTGC AATTGAAAGA ATATATGAAC AATATCCACA 720 

TAAACGGGGC GTTTATAGTG GTGGTGTTGG ATACATAAAT TGTAATCATA ACTTAGATTT 780 

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG . 840 

1S 

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900 

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 560 

2Q AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020 

AATGTGCTGA ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 1080 

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 114 0 

25 TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATTAA AGGCGACAAG 1200 

GTTATGCACG GCAAAGTTGA TACACTAAAG GTTATATCGC AT CATCAACA TCTGTTATAT 1260 

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC * 132 0 

20 AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 13 8 0 

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATC CTG AATCATTTGC TACAGACTAT 144 0 

GGTGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 1500

35 

TACTAACAAG AATAAAAACT GAAACTATAT TACTTGAAAG CG AC ATT AAA GAGCTAATCG - 1560 

ATATACTTAT TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 162 0 

CGGAGCGAGA AATCCAACAA CAAGAATTAA CATATATTGT ACGTAGCTTA ATT AATACAA 16 8 0 

40 

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 174 0 

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800 

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 1860 

45 

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 1920 

ACCTTGTATT CATTGGTGCA aCTGAATCAT ATCCAATCAT GAAGTATATG CAACCAGTTA 1980 

so GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AATCCATATC 2040 

ACTTAACGTA TCAAATGGTA GGCGTCTTTG ATCCTACAAA GTTAAAGTTA GTTGCTAAAA 2100 
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AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 2220 

ATTACACATT AAATGCGACT GATTATGGTT TGAAACATGC GCCGAATAGT GATTTTAAAG 2280 

5 GCGGTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGATCAGT 234 0 

CAAGTCGACG TGATGTTGTC TTACTAAATG CGGGTTTAAG CCTTTATGTT GCAGAGAAAr 2400 

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 24 60 

10 

TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520 

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 2580 

GTGAAGATTC AGAATAAAAA ATCTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 264 0 

15 

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 2700 

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2760 

GAAAAGTACT TTGGTGG TAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2 820 

20 

GCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGCT 2880

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2 94 0 

25 TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCG CCATGAA 3000 

TTAGAACGTG CCTATAAGGT TAATGCTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3 060 

GGATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120 

30 TATTATATTT CTGAAAGTGG TATTCACGAT GCATCTGATG TAAGAAAAAT CTTGGATAGT 3180 

GGTATCGATG GCTTACTAAT AGGTGAGGCG CTTATG CGTT GTGACAATGT AT CTG AATTT 324 0 

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3300 

35 ' CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCGATA . GGTTTCATCC 3360 

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTGTGCTG 3420 

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA AC^TTGAAC 34 80 

40 

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3540 

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT CACTAAAGCT TTAGCTGCaG 3600 

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3647 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5966 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

CcAcCTTGAC CACCTTTACG TGGAATCTTT TCmCCTkGAG CAACaTCGaT AATaTATATT 60 

GAAAgTCAAC AAGTTCTGGA CTAAATGTTG* CTGCTAAGTT ATCGCCACCA GATTCTATGA 120 

AAATTAGTTC TATATCGTCA TGACGTTCTA ATAATTCGTC TATTGCTGCA AAGTTCATAG 180 

ATGCATCTTC ACGAATCGCA GTATGAGGAC ATCCACCAGT TTCAACACCA ATGATACGAC 240 

TTTCAGGTAG AACTCCTGAA TTTACTAATA TCTTTTCGTC TTCTTTTGTA TATATATCAT 300. 

TTGTAATAAC GCCGATACTC ATTTCTTTTG AAAGACGTTT TACAACTTTT TCAATTAATT 360 

GTGTTTTACC TGCACCTACA GGACCACCAA TACCAATTTT AATCGGATTT GCCACAATTA 420 

TAACCTCCTA TGATATGAAA t TCTAACATT GaCGTTCTCA TGCGCCATTT GATTTAGTTC 480 

TAAACCAGGC GCTGTCATGC CAAAATCTGC TTCTTTTAAT TCGAAAATCT GCTTTCTTGT 540 

20 TCCTTCTATA TAAGGAATCA TGTGAGTAAC TATCTTTTGA CCAGCAGTTT GTCCAAGTGG 600 

AATAGCACGA ACAGCATTTT GAGTTAAACT TGAAACATTT TGATATAAAT AGTAATCAAT 660 

AATCGTTTCA ATATCTACAC CTAAATGATG GCCTAGCATA GTAAAACAAA TAGCTGGATT 720 

TnACTTTGCT TTCTTATCTT GCATTTGTTG ATGATACCAA GCAATCCATG GGCTATtATA 780 

AAGTTCTAAA GCCAATTTAA CCATGCGAGT CCCCATTTGT - kTTGCACCAA CACGTGTTTC - 84 0 

fTTTAGGTAAG TTTTGrACAr ACATCAGTTT ATCTATGTGT AATACTTTTT GTGTATCATC 900 

ATTTTCCAAT GCAT CATAAA CT AaACG CAT GGCTAAACCA TCAGAATAGG TAAGTTGCTC 960 

.TTGTAAAAAC ATTTTTAACC AAGCAATAAA AGTATGATCG TCATGAATTA TATTTCGTTG 1020 

35 -AATATATGTT TCAAGACCAA ATGAATGACT GAAAGCACCT GTTGGAAACT GTGAATCACA 1080 

GAACTGAAAT AATCTTAAGT GTGTATGATC AATCATGAGA ATGCCCTATA TGTCTGAAAG 1140 

CCTTATTAAC TTTACGGTCT TCTCGAACAT ATGGGATGCC TAAACTTTTT AATAAATCTT 1200 

40 CAACTAAATA ATCATATTGT ACTAGCATTT CAGTCTCTGT AAATTGTGCT GGCAAATGAC 1260 

GATTTCCTAA TTGATGGGCT ATATCTCCCA TTTCTTGCAA TGTTCTTGGT TGAATCACTA 1320 

AAAGATCTTC TGAATTAACA TCCACAATAA TCATATTATG GTCATCTGCG TATAAAATAT 1380 

45 

CTCCATATTG TAAGTCAATA GGTTGTTTTA AACGAATGCC TATTTCAGTG CCATGGTCTG 1440 

TAACGACTCT TTGAATACGT TTAACAAGAT CTGAATTTTC AAGGTATACT TTTTCGACGT 1500 

so GCTTTTGTTT TTCTGAATTT GACAAATTGG CAATATTGCC TTGGATTTCT TCAACAATCA 156 0 

TTCTATGTTC CTCCTAGAAT AAGAAGTATC TTTGAGTTAA TGGTAACTCA GTTGCTGCAT 1620 

TACTTGTAAT TTTTTCTCCA TCTACATATA CTTCATATGT TTGTGGATCA ACGTCTAATT 1680 
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GACGCAC CAT GCGTTTTAAA TTTAATG CAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800 

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 1860 

5 

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920 

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980 

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 204 0 

10 

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 2100 

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160 

15 GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220 

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 2280 

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 2340 

20 AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAGAT GAAGGTAAAA 24 00 

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 2460 

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520 

25 

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2580 

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 264 0 

30 CT CCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700 

AACCGACATT AATCGGTAAA CcTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760 

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2 820 

35 TCGTAATACC ACTTTCTAAT GCGACCTCTG CTTGTTCAGG ATTAATAAAA TGAACATGAG 2880 

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 294 0 

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTACCT ATGGCGAAAA 3000 

40 

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3 060 

CATTAGAAAT GACAAGGTGT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 3120 

CCATAGCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACGGCAT 3180 

45 

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACGAATGGAA TGTCCAACAG 324 0 

TTGGACCGTA TAAG CTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300 

SO TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3 360 

TTCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA T AC CA CG AAA 3420 

ACCAAAAATT TTACGTTTGC CAGCATATTC AACTAATTGA ACTTCTTTTT TATCCCCAGG 34 80 
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TTCGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600 

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3660 

5 

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3720 

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780 

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3 84 0 

10 

TCATTAACTC TGCAACGGTG TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3900 

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3960 

15 CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020 

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4080 

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 414 0 

20 TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4200 

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4260 

AGGTGAAAGA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 432 0 

25 

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 43 80 

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 4440 

3Q ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4 500 

CGTTAACAAT CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4 560 

TATTAACGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4 620 

35 TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4 680 

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4740 

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4 800 

40 

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4 860 

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTTAT CATTGTAGCT CTATTAGGTG 4 920 

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4 980 

45 

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCGACTTTT TTAGGTGTGT 504 0 

TATTAACAGT AGTGGTGCAA CTAGGTACAA CAACATTGCT TGAACCGTTT GG CTTACCTG 5100 

50 CATTAACATT GCCATTTATT ATCGTGACAT GGATTTTATT ATTTGCTGGT ATTAAACATG 5160 

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 5220 

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 5280 
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GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AATATGAATG 5400 

ATATGGATAA TTCGTTTTTA ATAACAACGG AAATTCAAAG AAAATGGATT GAAAAATTCA 5460 

AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 5520 

CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACG TTACCAAAAG 5580 

CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 5640 

AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 5700 

ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 5760 

GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 5820 

GACCAAGTAC ACATGCTGTT AAAG CTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 5880 

ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 5940 

20 AAAGTGTTGG TTTTTTCTTA GTAGAC 5966 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17310 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

CTGTGTCATC GCGAAATAGT TAGGGTCATT CATTAATCCT TTTGAACGTA TTTCATCAAA 60 

35 ATATAACAAT TTCATTAGTA AAGGGGACTT GTTCAAACCA GCTATAATAC AAAATAGACC 120 

TATAGTCACA CTGCTTATAA TATAAGAGGT AACGATCACT TTTTTGCTAT TAC CTAACTT 180 

AAAGSTGATC ATCCCTAAAT AGAAATAAAT GACTACAAAT GCATATTTAA CTGTAGATGC 24 0 

40 AAGAACTTCC TTAACCGTAA TAAATATCAA ATCATCAAAA AATaGCaAAC AArGCGTAAT 300 

AATCATACGA TATGTATACA AAATAATGAm AAACTGtmAA AAATGATTTG CCTTTAATAA 360 

ATGGTTAGCG AAAAACAGTA AATAAACTAA TATTAGTAAT GTGATAAAGT CAGCTATAGA 420 

AACATTCACA CCGGCAATAA CCGAAGATTG CTGAATAAAA ACCGCTAAAC CGATAAGTAA 480 

CAATGTTAGT AATTTACTAT TGTGTTGATT TTCCATTATA AACGTCTTCC ACTTCTTTAA 540 

TCATTTTCTC CTCAGTAAAA CATTCTAAAT AACGTTTTCT AGATTGATTA CTCATTTTGA 600 

TGTAATCACT GT CT ATT AAA TATTTTTCCA GGACTTTAGC AATAGTTTCG GGTTGGTTGT 660 

TCATCATACA TATACCATTA TCAGCTACTA ATTCTGAAAT ACCGCCAACA TGACTGGCTA 720

TTATTAAAAT AAACGTATCG TATTGTGATA ATAAATGACT CGCATTAATG ACATTGCCCA 840

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTG TTGCTGACAA TCATTTAATG 900

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 960

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 1020

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 1080

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 1140

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 1320

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCG 1380

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 1440

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 1500

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 1560

TGACATAAAC ATCATTGTGT ACGCAAAAAT GGTTGGCGAG TTGAATGAGA TGTGTTTGTG 1620

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 1680

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980

TTGTTGCGGT GTTAAGTCAT ATCCACCTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100

ACGTTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160

TAATTCATCA ATGCGTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 2340

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 2460

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2520
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AATTAAAGTA ATCCTTTAAA CCTGTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640 

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 2700 

5 

CATGCTCGAC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2760 

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820 

AG CGT ACAGA TTGAACAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTGCGTCAC 2880 

10 

CAAAAAATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2940 

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3000 

15 TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCGTAA TATTTATCTA 3060 

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120 

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180 

20 

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240 

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAACATAA TCAAATTGAT 33 00 

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360 

25 

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT ATAGTTATCT AGAACATAAA 3420 

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCGACCC AATAAAACCA GCCCCACCAG 34 80 

3Q TTATCAAAAC TCTTTCCAAA TGTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 354 0 

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 36 00 

TTTGTGAATT TTTATTGAGA TAAATTAGAT AGTGGTGTTT . TTGTGATGTT TTATAATATC 3660 

35 TTGGGTGTGT TAATAGTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720 

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780 

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3840 

40 

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3 900 

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3 960 

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTG CTTCTTC 4020 

45 

GGCTGTTGCC ACTGCGATAC TTTGCAT CAT TGGTGTTTGA ACGATACCAG GTGCGAATGC 4 080 

ATTCACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 4140 

SO TGCGAATTTT GTACTGCAAT ATAAAGACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200 

TGTTGCATTG ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4260 

AATACCCCAT AGCACACCTG CAACGTTCAC GCCATATACT GTTTTAAACT GTTCTTCAGT 4320 
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. GCCAAATTGC GCGGCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 444 0 

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4500 

5 CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4560 

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4620 

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4740

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4 920 

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4980 

20 TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 504 0 

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100 

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160 

25 

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 5220 

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA- ATTATTGCCG 52 80 

ATATAACTAT TGTTACTATT AAATAATCAG CTCTGCTACC TGATAATAAA TAGAAAAGGC 534 0 

30 

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTACTATAT 5400 

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT •■ TTCTACATTT TTTTCCATGT 5460 

35 CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA - TTCTTTAGTG 5520 

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580 

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 5640 

40 

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 5700 

GCACCTCCGA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 5760 

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 5820 

45 

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880 

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5940 

6 q TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000 

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6060 

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 6120 




EP 0 786 519 A2 






TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 6240

AGATTAAGAC ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 6300

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 6360

TGTTGAGAAA GAGAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 6420

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 6480

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 6540

GTAGTCGATG GCAAGCGATG TTCTTGTTGT AACGTTTCGG ACCACACACC AAATGGAACT 6600

TTATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCAGAG CATCTATAAC CATATATCTA 6720

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 6780

vi"i"i-i'CGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCTACT 6840

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAf CGTGATGTGA AATAGACCAT 6960

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 7020

ACTTCTCTAG CAAAGACATC TTTGGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 7080

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 7140

GGATGGTTAT GTTGCGAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 7200

GCATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCAGATA GTAACCAATA 7260

TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTGAATGTCA CATCTTCCAT TTCTTGCTCA 7320

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 7380

GCAXAGTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 7440

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 7500

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7560

AGCGCAgcTT CAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 7620

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG CAATAAAATG 7680

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7740

TGAACATGAA TACGGTGAAC ACGTTGACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7800

GCAGGGGCAC CAAAAATAAT ATGATTTGCT GGTTTAAAAG CAAGACCTTT TGCTATTTCA 7860

CCTTGAGATG CAACTTCGAA TCCTTCAACA TACTGACTAA TTGTATCTAG GATTTTTCGT 7920






TGTTGCAAAT GATGTTCCAG TCCGACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 804 0 

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 8100 

5 

.GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 8160 

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 822 0 

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 8280 

10 

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 834 0 

CTTCGACTAT CATGTCTAAA CCTTCX5ACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 84 00 

1S CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 8460 

GTAATGGTGT ACGTCCAAAT CTTG CCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 8520 
GGTAATAAOG ACTTAATTTC ACAATATGCT CAACTGTCTC ACGATCTTTA ACGTGTGGCA - 8580 

20 CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ; ATCTCTATCT ATCACTGCAG 864 0 

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8700 

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8760 

25 

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820 

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 888 0 

GGATTTGTAA CATG ATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 894 0 

30 

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACT CTGGA 900 0 

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060 

35 ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120 

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 9180

ACTTCTTTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 9240

TGctCACGCT TAAAACGAAC ACCATCATGG AAATCTTTTA AGGCAATACG TGTAGGCCAA 9300

CCATTTTCAT GAATGAGCAT CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 9420

GAACCATATT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 9480

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 9540

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9600

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 9660

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 9720

TATTTTGTCG TGTCTATTGG CGACATGGTA CGAATCGATT GTTGAGGGTG ATATAGCTCA 9840

TCACTTTCCC CTAACCATAG TACTGTGCCA TTAAGCCTTT CTTCAGCCAA ATCAACTTGG 9900

ATGAGATGTT CAAACTGCCA TGGGTGTACA GGTATGATCT CAACATCATT TACATGTTTG 9960

CCAGATGCTT CAATTTGCTG TACAAAATGT TCATAAGTCT TATCGCCAAC TTGTTGACGT 10020

AACATTTCGT TAACTACAAC ATTTCTTGAT ACCGTCGTTT CTACTTTATC TTTGTCGATA 10080

GCTAACCACT GCAGTTTAAC GTTTGGTACA AAATCAGGAC CAAATTTCAA ATTATCACTC 10140

AACGTAAATC CTAAACGTGA TTTGTAACTT GGATGATACT GATGCCCTTC CATCGCATAA 10200

AATTCATAGT CGTTAAATGT CTCAGGTGTT GCTGGTGGGT TTGATTCTCG ATACTGCATA 10260

CTTTGCGTAT CTTTTAATTC TGTCTGTAAT AACTCGACAA TAAATTGTTC TAGCTTTTCA 10320

TCATTTTTAG GAAATGTAAA TACAAGCTCT CTCAATAATT GTGTATAGTC TGTTGTTGTA 10380

TCTGCCTCAT CTCCTACGAC ACGCTCAATT GGTGATGTGA TACGTATACG ATCAAAGCTA 10440

TGTGTCTTTT CAGCAGTAAA ACGATACTCT GAATCATGTC CTTCTATTGT AAAATGACCG 10500

ACACCGTCTT GATATGACGC TTTATAGACA ACAATATTCT CATAAATAAG TGATGATACC 10560

AGTTGGTGCA TCACTCTAGT CTTTACACGA TTAAGAATTG TTTGATTCAC AATACGATAC 10620

CTCCTTGTTA TGACAAATTG GATTTGGTAT ATGTGTATAA ATAGGGTTTG CACCACAATC 10680

ATTCAATTTA CTCATCAAAT TCGCTTTAGC CGcAATGGTC GGCGTTTGAT ATAAATCTTC 10740

TACAGAGTCA ACAAATACTG CGTTATTCGC GTATTCTTTT TTCCAAGTCA TAAGACGATG 10800

CGCTACAAGT TGCCATAACA CAACTTCATT TCTAGTCGCT TTACCAATAG TTGATACTAA 10860

ATGTCCTAAG TGATTTACTA CAACGTAATA TTTAAGACGA TGCCATGCTT CATCATGTGC 10920

ATATACAACA GGGCTTGATG CTGCCACAAC ATTTGGCACA AGCTGTTTTT CAGTAGCAAT 10980

CGTT^TAGAT AGACAAATGC CTTCAAGATC TCTGACAAAG CATACGTCGG GTATGCCATG 11040

TTTTAATTCA ATTAATGTAT TTTGTACATG TGCTTCTAGA CTAATGCCTG TGTTACTAAA 11100

CAGCTTTAAT ATCGGCAATA ATGTACGATT CAAATAACAT TCAAGCCATG CTTCTGGTGC 11160

TAAACCACTT TGCTCAATCA CTTGTGATAA CTTAGACATC GGTGAATCAG GCATCGTTTC 11220

AAATAATGAC GCCAATACAT GAATATCTTT ATCAGCATGG TAATTCGGTA TCCCTTCACG 11280

AACAATCATG GCACTATTTG TTAATAAATC CATTTCAGGT TCAACTGTTT GCCCTAATGG 11340

ATTCGGTAAC AATGCACGAT ATCCTTCTTC AAACATCAAT TTAAAATGGG GTGTTTCAAC 11400

CTCATCTTTG ACTGATGCGA TAACTTGCGC GGCATCAATT GTCCGTTCAA TCTGTTCAAG 11460

GTCATTCGTA CGTATAAAAT TAGTGATTTT AACGTGTATC GGTAATTTTA AATAAATGTT 11520

GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCX3CAA 11640

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 11700

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000

CATAATTTGC GCCATATGTT GTTGCACTGC CGTTTGATTA TCTGCACTTT GAGCCATATG 12060

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 12120

AAGCGATACC TCTGGATGAT ACATATGATG CCCCATCGCA GACCAATAGC GAAATTCACC 12180

CGTTAAAGTT TCGAGCTCTG ATAATTGTAT AGACCATTGA TGATTTTGAG GTGGTACTTG 12240

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 12300

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480

CACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGTT CCAACAAATG 12540

TGCCTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 12600

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 12660

AACGTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720

CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780

ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 12840

AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12900

CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAGACA TTGAAATGAA CGGCGAATAC 12960

CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13020

CACCGAAAAT AGAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13080

CTAATATCGA AGCTGTAACA CCGCCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 13140

AACTTTGCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200

ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260

ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320
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CtATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 13440 

CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13500 

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13560 

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13620 

CCTCCACGAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680 

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13740 

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13800 

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860 

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGTGT ATCTTCTCTG 13920 

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13980 

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14040 

TTTTGACCAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100 

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160 

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220 

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280 

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340 

TCACAAACCA TGACATACTT AGCTAGTGCT T CAT CTTTTT CTATAAGCTG ACGTAATAAT 144 00 

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 144 6 0 

GACTGCATTG CAAATGGTAC TTTGACATGG TTATACGGTG CGCCAATATC AATTAATGAA 14520 

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14580 

TCACTAATCT CTTTCGCAAA GACGTTCGGC AGAATATGCT GATATTGCCA AGGATGTACC 14640 

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700 

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14760 

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14 820 

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 148 80 

CGATCT TTT A AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14 940 

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000 

ACTTCCCATT CATGACTTAG ATCACAATTC ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060 

GCCGTTAAAG GTTGCTTAGA CACCCTTCCC TCTATCGTAA TTGGTTGTGA ACTTTCGTAA 15120 
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TATATCAAAA GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 1524 0 

ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCG CTTG TGCTGTATTG 15300 

5. 

ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 1536 0 

TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AG CACTTG AA ATATCTTCGA .15420 

TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480 

10 

- - •' CACCTGTCAG AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 1554 0 

ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600 

15 CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660 

ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720 

TGACTACTTC ACCATTTGAT ACTGCTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780 

20 

CCGGACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 15840 

AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15900 

CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG . 15960 

25 

CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16020 

* " CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 16080 

30 TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 1614 0 

CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 16200 

CCtGTCTTAA ATACGGCTTA AGCGGTTGTA CAAAATCATT GTGCGGATGG GCTGTTAATG 16260

CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320

ATCTATTTAA ATACAACATC TCTCTatTGa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 16380 

TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 16440 

AATCGTGACA ATTGTTGGAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 16500 

AATCGAACCT GTTGAACCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 16560 

CAAAGCAGAT TGATAATCAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620

TTCGGGTACA CGACTAGCAC CGATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680

AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740

TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800

AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16860

TAACTCATTC GCATATTGAT TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16920 
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TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTGATATTT TAGGATCAAC 17040 

AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 17100 

ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 17160 

TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 17220 

CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAACCATAG GTGTTTGCCC 17280

TACAGAATCT AACAATGAAT CGTGCACATG 17310 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5423 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 60

TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120

TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180 

TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240

TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300 

AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 360

AGCTAAGTCT CTAGCCATAT TACCAATACC TCTCATCAAA CCACGGATCA TATCAGCACC 420 

TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 480 

ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 540

CGTACtTGTt ATAGTAGATA CCCaTnGCAT ACCTTTAGTG ACmATGAAGT TCCAAGCTTG 600

AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 660 

AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 720 

ACTCCATGCC GCTTGTAACG CAGTAGATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 780

TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 840

TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900

CCAAATCGTA GTAGCGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960 

AGCTGTCCAT ATCGTCATAA CTATTGTCAT TATCGTCGTG AAAACAGTTG TAATGATTGT 1020 
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ATAAGCGACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCG ATACTGCTGT 1140 

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 1200 

AACATTAACC ATAAACGTTT TTATCGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 1260 

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 1320 

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 1440

TTTTAATAAC ATGAACGCAC CTTTTAAAAT TGTTAATCCC GCTCTTAATA AACCGAACTT 1500

ACTTACTAAT GCAATGrTTC TACCTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 1560

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 1620 

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 1680 

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCArmTA ATCCTCTTGC 1740

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 1800 

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 1860 

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 1920 

TCCAACGAAA ACATTTTTGA AAATATTACC AATGATAGGT AAGTTTGTTT TTGTGTATTC 1980 

AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2040

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTGCTAATT GCGTGAATAC 2100

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 2160

ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCCCATTT 2220

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 2280 

GCTCTGCATT GCAGTTTTAA CAGTATTTAA ACCATTTGCA AGAGTTGTGA AGATAGCGGA 2340 

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTTTTTTGTA 2400 

TTCGTTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 2460 

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 2520 

AAGTACACCA CCAGTTAACA CTTTGATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2580 

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2640 

AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 2700

ATTGTCATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 2760 

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTTAAACG CATAAATAGT 2820

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AAGTTCTTCT TTAGTACGTT TGATTTTAGA GTTAGCAACA CCATTGTCCA CGTCTATAAT 2940

AGCTTTGGCT TTAGACCTAT TTAATGCTTC GAGACTAGCT TTAGATACTT TTAACACTCG 3000

ATTGAATTTA CTGTTATCTG CATTGACGTC AATATfGACA CGTTTCTTTT CTAATTCTGA 3060

TAATTTAGCT TCTGTTTCAG CGATATCTTT AATCAACTTT TGTTTTTGCA ACTTAACTTC 3120

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC TAGTTCAAAA TTCGATTCTA GTACCTTTTG 3180

TTGTAAATCT TGTATACTAG CATCTAATTT AGCTTTTACA TTTTTGTTAC TAAAGGCATC 3240

TAAAGACTTT TTAGCAACTT TGATAGTTTT TTGTAAATTT TTATCGTTAG CGTTTAATTC 3300

AACATCTTTA GTTTGATCTG CTACTCGTTT AAATCTTTGC ACAGACTTAA CCGCACTATC 3360

AATTTGCCTT TTGAATTTGG CTACACTAGC TTCAATAGTC GCTTTAATTT TATATTCCGT 3420

CACATTAACA CCTCTCTTTC TATTGCTTAT TAAATTCTGC TATAACTTTA AAGAATTCAT 3480

TATTTTGTGG TTCGTATTCA TCACGTTCGC TACTAAATCT TATATCTTTA CCTTCGTTAA 3540

GCCGTTGGAT ATTTTCTTCA TAAGGCAATA CGTCGTTTGC ATTGTTAAAA ACATATTCCT 3600

CTTTAGGTTT ATTTTCTGTC CCAACATTTT TAGTAGCTGC AGCATCACGA ATAGCAAACG 3660

CAAGTTTGTA ACGTTCGAAT TCTTGGGTTA GCATTTCATA CTCTTTCGCA TACATTCGAT 3720

AGTTATATTC TGTTAATGTC ATTTGCTCAA TAACGTTCAA ATGTGTAATA CCAAGTGTTG 3780

ACATACAAGT TATAACGATT CTGTCGTAAG TTATTAGGcT TCCGCTGGTT TTTCTTCCGT 3840

TTCCAGTACT TCGACTAGGT TTCGGGTCAT AGGTCGCTTT CCCAAcTCCG TTAAAATATC 3900

GGAAGGGAAT TCTTCfAGTC CGATATTTTC TGCGATTTCA TCTAATGCTT CATCAATGTT 3960

ATTAATAGTA ATTGCTTGTT TTTTTAAGTG AGATGTAGCT GCGATTAAAA cTTCGCCAAT 4020

CACAACCGGA TTTCCACTTT CTAAACCTAC AGGCAACATT GATACACCTT GACCGATAGA 4080

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT ATCGATTTCT CTTAAAAATT TAAAACCAAA 4140

ACTTAATTCT AATGACTTTC CGTTAATTTC TACATTCATA ACTTAAAATC TCCATTCATA 4200

ATTAATTTAA ACAAAATAAA mArGCTTAAC GCCCTATTTT TATACCTCTC TTGGTGCAAC 4260

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT TGCTGTTAAA TCTTCGCCAG TTAATGCATC 4320

TGCTTTTGTA GTGTCGTGGA ATCTGTATcG AGTCGCCTTA AGTTTCTTTG TTACAGCCTC 4380

AGGTAGTGTT GCAAATCCAC GTTGGAAACG ACCATTCACT CCATATTCAT ATTCATATTC 4440

ATCAATACCG TTAGCTTCTG CTTTTAATTC AAATTTATTG TGGAAACCTT GGAAATATTT 4500

CGCTTTAAAT TTAGCGGAAT CCCCATTTTT GCCTGGTATT CTACTTTCAA CTTCCCAAGC 4560

TTCATACAAT ACGCGATCTA CAACTGCATC TTCAATTTCA TCTGCAAAAT GGTCACCATA 4620
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GTCCATTGTA TCCTCTGTAT CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4740 

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATGATT 4800

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860 

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100 

CATTTTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220 

CATGCTTTTA TCTCCTATTC AAACAACGCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280 

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTACC AAATTCTAAG 5340

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400 

CTGATATTGC GTGATaAATT ACC 5423 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6251 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60 

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120

TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240 

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300 

TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360 

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420 

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600
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TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTC CAT AA 720 

TGTTGGTTTA GTAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780

AAGAGAAAAA GGTGGAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACGAAGAAC 840

ACTAGCAATA TCAGTTTTTG GtGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900 

TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 960 

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 1020 

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 1080

TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 1140

GGTCATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA GTTTCTACTA CTGTTGTTGA 1200 

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260 

ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 1320 

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 1380

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 1440

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 1500

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 1560 

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 1620

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 1680

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 1740

TAATAGCTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 1800

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 1860

ATAWVCGTAG AGAAGCAATC AGACAACAAA TTGATAGCAA TCCCTTCATC ACAGACCATG 1920

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 1980 

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 2040

GTTCTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100 

CGCAATCAAT TTTAGATATT ACATCGGATT CTGTTTTTCA TAAAACTGGA ATTGCGCGTG 2160 

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCAACAG 2220 

TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 2280

GAGCAGAAGC ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 2340 

ATGTTAAACA TACATTAGTT TTCAAAGGAA ATTTTAAAAT GTTTTATGAT AAGCGAGGAT 2400
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TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2 520 

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2580 

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACGTA AAAAAGATAG CTCAATGGTA 2640

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 2700 

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 2760 

GCTTTAGTAG TAACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 2820

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 2880

GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2940

CCAGCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3000 

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 3060

GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 3120

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180 

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 3240 

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 3300

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3360 

ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420 

TGCCCAAAAA GTTGGTATGG CGCAAGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 3480

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3540

AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCATA GTTTAGGTGA 3660 

ATATTCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 3720 

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 3780

ATTGGGATTA GATTTTGATA AAGTCGATGA AATTTGTAAG TCATTATCAT CTGATGACAA 3840

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3900

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3960

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4020 

TTACATTAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 4080

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 4140 

AGTACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 4200 
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AACATCAATT CAAACTTTAG AAGATGTGAA AGGATGGAAT GAAAATGACT AAGAGTGCTT 4320 

TAGTAACAGG TGCATCAAGA GGAATTGGAC GTAGTATTGC GTTACAATTA GCAGAAGAAG 4380 

GATATAATGT AGCAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGcA GTAGTCGAAG 4440

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4500

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4560

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAAGAA GAGTGGGATG 4620

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4680

TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4740

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4800 

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860 

TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4920

CGTTAGCACG TTTTGGTCAA GACACAGATA TTGCTAATAC AGTAGCGTTC TTAGGATCAG 4980

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5040

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAAGCTAG ATAAGGTTCC AGAATTTCTC 5160 

CCTAAGAAAC ACTAATCAAt aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 5220 

AATTTAAAAT GGGAAAATAT AGTAGTCTAT GTATAGGCAT TTTTAAAGGA GGTGAATCGA 5280

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 5340

TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 5460

AAAAS^TCAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 5520

CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 5580 

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 5640

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 5700

TAGGCTTTAC TTATCAAAAT ATTGATTTAT ACCAACAAGC ATTTTCGCAT TCGAGTTTTA 5760

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5820

CGGTATTAGA ATTGACGGTT TCACGATATT TATfTTGATAa ACATCCCAAC TTGCCAGAAG 5880

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATAXTTGCGA 5940

ATAAAATTGG ATTGAACGAA ATGATTTTAC TTGGTAAAGG TGAAGAGAAA ACAGGGGGAC 6000 
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ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 6120 

AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180 

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCGTC 6240

TATTCACTTC A 6251 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4920 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

ACCTACTGAA GTTGCTAATT TTTTGGAGCA ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120 

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180 

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240

ATcAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420

TGACGCTTGA AAATATTCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600 

CAGCGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660 

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720 

GCAATCGTCC CTTTTAATTT AACTTAGAGT TTTTTAAATT TTTAAGGAGT GAAAAAAATG 780

GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840 

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140 
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TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 1260 

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 1320 

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 1380

GATAAACGTT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 1440

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500 

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG cGAAAAATAT 1560 

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGCAC TGGaTTGGGA TAAAGCATCA 1620 

ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 1740

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 1800

GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 1860

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920 

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 2040

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100

TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACGTGT GTGGGGTGTA 2160

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 2220

CATGTOGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 2280

GACTTACTAC CAGAAGGATT TACACATCCA GGCAGCCCTA ACGGTACATT TACTAAAGAA 2340

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 2400

AGAGCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 2460

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 2520

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2580

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2640

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTGTG ATGAAATTTT AAAACAAACA 2700

TCTGATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 2760

TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2820

CTAAATCGTT TACGTGAATT TACTGCAAGT ACGATTAACA ACTATGAAAA CTTTGACTAC 2880

TTAAATATTT ATCAAGAAGT TCAAAACTTT ATCAATGTTG AGTTAAGTAA TTTCTATTTG 2940
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CAAACAGTGT TATATCAAAT TTTAGTTGAT ATGACGAAGT TGTTAGCACC AATCTTAGTG 3060 

CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120 

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180 

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3240 

GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 3300

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3360

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 3420

GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 3480

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3540 

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTCTCT TCATAATCAT GTTGTAGTTT 3600 

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACCAAGT TACCGGTCAT 3660 

ATATGTAAAA AATGTGCGAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3780 

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3840

AAACTTTGTC AATGGTATCA CTTTAAATGT GAGAGATAAG AATGAATTAA AGCCATTTTA 3900

TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3960

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4020 

GTCCGAAGCG GGACTGTTTC ATATCGCAAT TTAACTACCT CAAATTAGTG ATTTAGCTAA 4080

TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 4140

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4200 

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4260

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 4380

TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 4440

TGGCTATTAT CAACATTTGG CCATGAATGA TTGGGTATCA GCAACGAAAC GTGTAGAAAA 4500

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4560

TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4620

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4680

ATGAAGGAGG CTGGGACATT AAGTTCTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4740 
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TTTTC CTTAT ATTAATTGCC ATTAATACAA AACCTAGCTC TCGTTTAACT TTATTTATTC 4860 
CTCGAACTGA CATTCGnGTG AACTCAAAAt nGCCTACTTn CTTAAATTAC CAATATCTAT 4920 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGGTAT 60 

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120 

TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180 

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240 

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300 

GTTTGAAATG GGTAGAGAGT TGCTTGTTCA AGTTCGTAAG CGTCAATTAA AAGCGAAAAT 360 

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATCGTT 420 

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATTGGT GCAAAATCTA 480

TAGGAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540 

CTGAAGGCGA ACGGAGAACA ACGTTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600 

TCACTAGAGG AACGCGTACA TCGTTT 626

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

nGGAAGTGGT GTATATATTT GTAATGAGTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60 

AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120

GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180 

TGTTTATAAC CACTATAAGC GTATTCAACA ATTAGGACCA AAAGAAGATG ATGTTGAATT 240 
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AACCTTAGCC AAGACGTTGA ATGTACCATT TGCAATTGCA GATGCGACAA GTTTAACTGA 360 

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420 

CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATTG ATAAAATTGC 480 

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540 

ATTGCTTAAA ATCTTAGAAG GTACGACTGC AAGTGTTCCG CCACAAGGTG GACGCAAACA 600 

TGCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660 

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CCGTCTTGGT GAAAAAGTTA TTGGTTTCTC 720 

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGAOGT GTGCCAATTG TAGCTAATTT 840

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900 

GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960 

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020 

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080 

GAaGGTAGTT ATTACAGCAC AAACmATTAA TGrAGaACTG AACCAG 1126

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4392 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATTGACTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60 

GTTGGTGAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120 

GAAGTAACTG CTACTCCAGA CAATATTCCA GAAGCAATCG AAGTAGACAT TACTGAATTA 180 

AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240

AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAACC AACTGAAGAA 300

GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360

AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420

TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480

AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540
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CTTACTAAGC TAAAGAATAA TGATAATTGA TGGCAATGGC GGAAAATGGA TGTTGTCATT 660 

ATAATAATAA ATGAAACAAT TATGTTGGAG GTAAACACGC ATGAAATGTA TTGTAGGTCT 720 

AGGTAATATA GGTAAACGTT TTGAACTTAC AAGACATAAt ATCGGCTTTG AAGTCGTTGA 780 

TTATATTTTA GAGAAAAATA ATTTTTCATT AGATAAACAA AAGTTTAAAG GTGCATATAC 840

AATTGAACGA ATGAACGGCG ATAAAGTGTT ATTTATCGAA CCAATGACAA TGATGAATTT 900 

GTCAGGTGAA GCaGTTGCAC CGATTATGGA TTATTACAAT GTTAATCCAG AAGATTTAAT 960 

TGTCTTATAT GATGATTTAG ATTTAGAACA AGGACAAGTT CGCTTAAGAC AAAAAGGAAG 1020 

TGCGGGCGGT CACAATGGTA TGAAATCAAT TATTAAAATG CTTGGTACAG ACCAATTTAA 1080

ACGTATTCGT ATTGGTGTGG GAAGACCAAC GAATGGTATG ACGGTACCTG ATTATGTTTT 1140

ACAACGCTTT TCAAATGATG AAATGGTAAC GATGGAAAAA GTTATCGAAC ACGCAGCACG 1200 

CGCAATTGAA AAGTTTGTTG AAACATCACG ATTTGACCAT GTTATGAATG AATTTAATGG 1260 

TGAAGTGAAA TAATGACAAT ATTGACAACG CTTATAAAAG AAGATAATCA TTTTCAAGAC 1320

CTTAATCAGG TATTTGGACA AGCAAACACA CTAGTAACTG GTCTTTCCGC GTCAGCTAAA 1380

GTGACGATGA TTGCTGAAAA ATATGCACAA AGTAATCAAC AGTTATTATT AATTACCAAT 1440

AATTTATACC AAGCAGATAA ATTAGAAACA GATTTACTTC AATTTATAGA TGCTGAAGAA 1500

TTGTATAAGT ATCCTGTGCA AGATATTATG ACCGAAGAGT TTTCAACACA AAGCCCTCAA 1560

CTGATGAGTG AACGTATTAG AACTTTAACT GCGTTAGCTC AAGGTAAGAA AGGGTTATTT 1620 

ATCGTTCCTT TAAATGGTTT GAAAAAGTGG TTAACTCCTG TTGAAATGTG GCAAAATCAC 1680 

CAAATGACAT TGCGTGTTGG TGAGGATATC GATGTGGACC AATTTCTTAA CAAATTAGTT 1740

AATATGGGGT ACAAACGGGA ATCCGTGGTA TCGGATATTG GTGAATTCTC ATTGCGAGGA 1800 

GGTATTATCG ATATCTTTCC GCTAATTGGG GAACcAATCA GAATTGAGCT ATTTGATACC 1860 

GAAATTGATT CTATTCGGGA TTTTGATGTT GAAACGCAGC GTTCCAAAGA TAATGTTGAA 1920

GAAGTCGATA TCACAACTGC AAGTGATTAT ATCATTACTG AAGAAGTGAT CAGCCATCTT 1980

AAAGAAGAGT TAAAAACTGC ATATGAAAAT ACAAGACCCA AAATAGATAA ATCAGTGCGC 2040

AATGATTTGA AAGAAACGTA TGAAAGCTTT AAATTATTCG AAAGTACATA CTTTGATCAT 2100

CAAATACTAC GTCGCTTAGT AGCGTTTATG TATGAAACAC CTTCGACAAT TATTGAGTAT 2160

TTCCAAAAAG ATGCAATCAT TGCAGTTGAT GAATTTAATC GTATTAAAGA AACTGAAGAA 2220

AGTTTAACAG TAGAGTCTGA TTCGTTTATT AGCAATATTA TTGAAAGTGG TAATGGATTT 2280

ATAGGACAAA GTTTTATAAA ATATGATGAT TTTGAAACAT TGATTGAAGG CTATCCTGTC 2340

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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATGCGTTC TGAATTTCAA 2460

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAACCGAAAC TAAAGTTGAA 2520

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2640

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 2700

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2820

CTCGAAGTGG GGCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 2880

CTATTTGTTC CAGTAGATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 2940

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTCAA 3000

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 3240

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3300

ACTATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3360

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 3420

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 3480

CAGTATAAAG ATTTAGGGCT GTTGATTGTA GATGAAGAAC AACGATTTGG TGTACGCCAT 3540

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3600

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGTCAGT GATTGAAACG 3660

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTATC 3720

AAAGAAGCTT TAGAAAGAGA ACTATCCCGT GATGGCCAAG TGTTTTATCT TTATAATAAA 3780

GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3840

GCAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3900

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT CGATGTCCCA 3960

AATGCAAATA CTTTGATCAT TGAAGATGCA GATCGCTTTG GATTGAGTCA GTTGTATCAA 4020

TTAAGAGGTC GTGTTGGTCG TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4080

AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 4140

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

TTAGGTAAAC AACAGCACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260 

TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320 

GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380

GCTAAAATTG AA 4392 
(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60 
GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120 
TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180 
TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240

ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300 
CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCGATTTT CCAATACCAT 360 
TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420 
TTGCAGTTTG ATAACCGATT TCTAAATTTT TTACATGCAT GACGTCATTA CCTGTATTCC 480

GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540

CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600 
TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660 


GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720

AGCGTTTGA 729 
(2) INFORMATION FOR SEQ ID NO: 31: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13856 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
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TGATGTTTCG ATACATTTGT TGCACCTTGT GGATATACTT TAAAGGTTGT GTCGTATGTT 120 

TCCTTACTAT CTTTAGCTTC AGATTCCTGT GATTCAACCG TTTTATATTT TTCAAGTGCA 180

TGTCCTTCAA TATCAACTCG TGGAATAATG CGATTCAACC ATGCTGGTAA ATACCACGAA 240

CCTTTtCCAA ACAATTTCGt TAATGCAGGA ATTAACATCA TtCTGACTAC GAAGGCATCA 300 
AAGAGTACAC CAAACGCTAA TGCCATACCC ATTGATTTAA TCATGACATC TTCTTGGAAT 360

ACAAACGCAA AGAAGACACT AAACATAATT AATGCAGCTG CTACAATAAC AGGACCGCTT 420 

TCTTTCAATC CTACTTTGAT AGAATAATCA TTATCCCCTG TTTTACTATm yyCTTCATGr 480 

ATTCGCGACA TAAGGAAGAC TTCATAATCC ATCGCTAATC CAAATAAGAT ACCTATAGTA 540

ATAACCGGTA AAAATGCTAG CATTGGTCCT GTCGTTTCAA TACCAAACAG ACCTTTCATA 600

AAACCATCTT GCATTACTAA TGTTGTAAAT CCTAATGTTG CCATTAATGA CAAGACGAAT 660 

CCTAAAACTG CTTTTAATGG TATTAGAATT GAACGGAAGA CAATCATTAA TAAGAAAAAT 720

GCTAATACAA CAATGACTGA GGCAAATAAA GGTATCGCCT CATTTAACTT TTTAGACATA 780

TCAATATTAA TGACACTTTG TCCCGAAATC TCCGTTTTGA AGCCATATTT ATCTTGTGCA 840

TCTTTATGAT AATCTCGTAA ATCATGCACT AAATCATTTG TACTCTCTGC ATTAGGCCCT 900

TGCTTAGGTA TCACGACCAT CAAAGCGTAA TCATTATCTT TACTCATTTG TGGTGGCGTA 960

ACGATATCTA CATTTTTCTT ATCTTTAATA TCTTTATATA CAGACTGTAA ATCTTGTTGT 1020 

AATCCTTGTG GATCATCCTT TTTATCTTTC ACATTTATCA ACATCGGTAT TTGGCCATTA 1080

AATCCTTCAC CAAATTTATC CGAGATAATA TCGTAAGCTT TTTTCTGTGT AGAATCTGCT 1140

GGTTTAACAC CGTCATCTGG AATACCAAGT CGCATATGAC TAACTGGTAT TGCAGCTGC? 1200

ACTAATATGA TTAAACCTAG TAATACTGCC GCAAGTGCAT TTCCTGTAAT AAATTTAGAC 1260 

CATGGCGTAT CAATATCTTT TTTGAATTTA GACTGTAATT TATTCACTTT AATGCGTTtA 1320

TGGAAAATGC TTATTAATGC AGGTAATAAA GTTAAAGCGC TAAGTACTGC AAAAACAACA 1380

CTAATTGCCG AAGCAAATCC CATTACCGCT AAGAAGTCAA TGCCTACTAA TGATAAACCA 1440

CATACTGCAA TTACAACTGT TACACCAGCA AAAACAACTG CACTACCTGC TGTTCCTATT 1500

GCAAGACCAA TGCCTTTAAT GTAATCTGTT TCAGTTTTCA TAACTTGTCG ATATCTGAAT 1560

AAAATAAATA ATGCATAATC GATACCAACT GCTAGTCCAA TCATTACGGC TAATGTCAGT 1620

GTGACATTTG GTATATCGAA TGCATAAGTT AACAAACTGA TAATACCTAC ACCAGAGGCT 1680

AGACCAATCA ATGCACTTAT AATTGGTAAT CCTGCAGCAA TGACTGAACC GAATGTGATT 1740 

AACAGTACAA CAAATGCAAC AATAATACCA ACTAGTTCAG AATTACCGCC TACTTCTGTA 1800 
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AAATGACTTT TAACATTATC TCTAGAGCCA TCTTTTAAAG ATGTTTGACT AACGTCATAT 1920 

GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1980 

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100 

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160 


TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220 

AATATTGCAG CTACAATCAC TATCCATGCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280

AATGTCCCCA TCTTATATAA AAATTTTGCC AAAGTATATT GCCTCCTTTT AAAATCAACG 2340

TTATAGTTTA AATATACAGT GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400 

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATGA 2460 

AAGAGACTGA TTTACGAGTT ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580 

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640 

TCAATCAATT GACTAAAGAC TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700

TCCAAAGGAT GAGTGATACG ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820

ACGATATCAA AAATAATAGA GACCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940

TTGATTGGCC TGGCGAAGAT ATTGATAACA TTTTCCATAG ATTAATCAAT ATTAAGATTA 3000

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060 

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120

ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360 

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420 

TGTTTTAGCG TCCGTACCTG ACATTTTGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480

TGAAGTACAG CCTCTGTTAA GGTATAAATT GCCTACATCA AATTCGTTTA CCGCTTTAAT 3540

CCAATGCTCG CGATTATTTG TAATCACTGC ACCAGTTAAA CCGTAATCTG TATCATTTGC 3600 
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TTCTTCTTGC ATGATTCTAT CTTTAGATTT AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3720 

GTAACCTTTT GAATCATCAG TGCCGCCACC TTGTTCTAAT TTACCTTCTT CTTTACCAAT 3780 

CTCAATATAA TTTTTAATCT TATCAAATTG TTTTTTATTA ATAACTGGGC CCATATACGT 3840 

ATTGTCTACA GTATTGCCCA ACGTTAATTC TTTTGTTAAT TTGATTGATT TCTCTAATAC 3900

TTCGTCATAA ACGTCTTTAT GCACAATTGC ACGTGAACAT GCTGAACATT TTTGACCAGA 3960 

AAAACCAAAT GCTGACGTTA CAATAGCTTC TGCTGCCATA TCTGTATCAA TATTTTCATC 4020 

AACTACAATG GCATCTTTAC CACCCATTTC AGCGATAACA CGTTTCAAGA AGTTTTGACC 4080 

TTCTTGAACA ACGGCACTAC GTTCATAAAT TCTAGTACCT GTCGCACGTG ATCCTGTAAA 4140

TGTAACGAAA TGCGTATCTT TATGATCAAC TAAGTAATCA CCAATTTCTT TCGGATCACC 4200 

AGGAACAAAG TTAACTACGC CTTTTGGTAA TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260 

ATAAGCGATA TAAGGTGTAT CCTCAGCAGG TTTCAATAAC ACTGTATTAC CTGCCACAAC 4320
TGGTGCTAAA GTTGTACCAG CCATAATCGC AAACGGGAAG TTCCACGGCG GAATTGTAAC 4380

ACCTGTACCA ATTGATTTAT AGAAATATTT ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440 

CTTACCTTGA GCCAAGTCCA TCATTGAACG TGCATAGTAT TCAATAAAAT CAATACCTTC 4500

AGCTGCATCA CCAACTGCTT CATCCCATGG CTTACCTGCT TCATAAACCA TAATTGCTGC 4560

AATTTCCGCT TTTCGACGAC GAATAATTGC CGAAACACGT AACATAAGCT CTGCACGATC 4620

ATTTGCTGAC CATGTTTTCC AAGATTTATA AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4680 

AACATCTTGT TTTGTTGCCT TTGATGCATT TGCAATCACT TGTGATGTGT CTGCAGGATT 4740

GATTGATTTA ATTTTGTCAT CTTTGAAAAT CTTCTCTCCA TTAATCACTA ATGGTATGTC 4800

TTGACCTAAT TCTTTTTCCA CGTCTTTCAA TGCTTTCTTA AACATATCCA CATTTTCTTG 4860 

GACTGAAAAA TCGTAACCAG GTTCATTTTT AAATTCTACT ACCATGTACA CTTACCCCCT 4920 

ATAAATTTTG AAAGTGGTTT AACCCTTTGA TTTAATGATA TAACATCATT TAAACTCATT 4980

TTACTATGAT TAAGGTTAGT TTTGCAATCG CTTTCATTTT TATGTTTTAT CACTTATTCT 5040

CAAGTATTTT GAAATTGATT GGTTACTTTT TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100

TATCGTTTCG TCATTTAATG TTTCGGATGG TAGGTCATTA TCAATTTTAC GAACGACTTT 5160

ACAAGGGTTT CCAACCGCTA AGCTGTGTGG CGGAATATCT TTAGTGACAA CACTACCAGC 5220

ACCAATCACA CTGCCTTCTC CAATCGTCAC CCCTGGTAAC ACGGCTACAT GACCGCCAAA 5280

CCAAGTATTA CTGCCAATAT GAATGGGTGC GGCTTTTTCA AAACCTTCAT TTCTATGATG 5340 

GAAATTAAGT GGATGTGTCG CTGTGTAGAA TCCACAATTA GGTCCTATAA AAACATTATC 5400 
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TCCTAGTTTA ACGTTCCAAC CATAATCTGT ATCAAAAGGA ATCGAAATAC TTACATTGTC 552 0 

TGTTGTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 5580

TGTATGATTT AATTCAAAGC AAATATCTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 5640 

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 5700 

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 5760 


GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 5820 

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 5880 

AATCTATTAA CGTACAATGT GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 5940

ATTAATTACA GGCACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6000 

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 6060 


CATCCATCTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 6120 

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 6180

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGAGA TTGTTGTGCA 6240 


ATGTTTGCAC GGCAATCTCT CTTTTTCTTT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 6300 

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 6360

AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 6420

TCAAATCGTA TCGAAAAGAA ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 6480

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 6540

AATACGATTG AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 6600

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 6660 

GTTGAAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATCCTGAA 6720 


GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 6780 

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 6840 

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 6900 

TTCAATATGC ATATACCAGC ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6960

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 7020

GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 7080

ATTTATGAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 7140 

ACAAGCGCCG TACTAATTAC CTTAATCTTA TTCGGTAAGT ATTTAGAAGC TAGAGCGAAG 7200 
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TTAAAAGATG GTAATGAAGT GATGATTCCT CTAAATGAAG TACATGTTGG AGATACACTT 7320 

ATCGTTAAAC CAGGTGAAAA GATACCTGTT GATGGCAAAA TTATTAAAGG TATGACTGCC 7380

ATCGACGAAT CTATGTTAAC AGGTGAATCT ATCCCTGTTG AGAAGAATGT TGATGATACT 7440 

GTAATTGGTT CAACGATGAA CAAAAACGGT ACTATTACTA TGACAGCAAC AAAAGTTGGC 7500 

GGGGACACTG CGTTGGCAAA TATTATTAAA GTTGTCGAAG AAGCTCAAAG TTCTAAAGCG 7560 

CCGATTCAAC GATTGGCAGA TATTATTTCT GGTTATTTCG TTCCTATCGT TGTTGGTATC 7620

GCACTATTAA CATTTATCGT GTGGATTACT TTAGTTACAC CAGGTACATT TGAACCTGCA 7680

CTTGTTGCGA GTATTTCCGT TCTCGTCATT GCTTGTCCAT GCGCATTGGG ACTTGCTACA 7740

CCAACTTCTA TTATGGTAGG TACTGGTCGC GCTGCTGaAA ATGGTATTTT ATTTAAAGGT 7800 

GGCGAGTTTG TTGAACGCAC ACATCAAATT GATACCATCG TTTTAGATAA GACGGGTACC 7860 

ATTACAAATG GTCGTCCAGT CGTGACAGAT TATCATGGTG ACAATCAAAC GCTACAACTA 7920

CTTGCTACTG CTGAAAAAGA TTCTGAACAC CCATTGGGAG AAGCCATTGT CAATTATGCA 7980

AAAGAAAAGC AATTAATATT AACTGAGACA ACAACATTTA AAGCAGTACC TGGCCATGGT 8040

ATTGAAGCAA CGATTGATCA TCACCATATA TTGGTTGGTA ACCGTAAATT AATGGCTGAC 8100
AATGATATTA GCTTGCCTAA GCATATTTCT GATGATTTAA CACATTATGA ACGAGATGGT 8160

AAAACTGCTA TGCTCATTGC TGTTAATTAT TCATTAACTG GTATCATCGC AGTGGCAGAT 8220

ACTGTCAAAG ATCATGCCAA AGATGCTATA AAACAATTGC ATGATATGGG CATTGAAGTT 8280
GCCATGTTAA CTGGCGATAA TAAAAACACT GCTCAAGCCA TTGCAAAACA AGTAGGCATA 8340

GATACTGTTA TTGCAGATAT TTTACCAGAA GAAAAAGCTG CACAAATTGC GAAACTACAG 8400

CAACAAGGTA AGAAGGTTGC GATGGTTGGT GACGGTGTAA ATGATGCACC TGCATTAGTT 8460

AAAGCTGATA TCGGTATCGC CATTGGTACA GGTACAGAAG TTGCCATTGA AGCAGCTGAT 8520

ATTACTATTC TTGGTGGCGA CTTGATGCTT ATTCCTAAAG CCATTTATGC AAGTAAAGCA 8580

ACCATTCGTA ATATTCGTCA AAATCTATTT TGGGCATTCG GCTATAATAT TGCCGGTATC 8640 

CCTATAGCTG CATTGGGCTT ACTTGCGCCA TGGGTTGCTG GTGCTGCAAT GGCACTAAGT 8700 

TCAGTAAGTG TTGTCACAAA CGCACTTAGA TTGAAAAAGA TGCGATTAGA ACCACGCCGT 8760 

AAAGATGCCT AGATTCCTTA ATAATGAAGG ATTCGTTGGT GATTCTGAGA TAGGCTAGTG 8820

ATTGGCTCTA TAATGTCGCG GTTTAyaGTt GGATCTTCGC TCCAACTGCA TATATAGTnA 8880

CACTTTTCGC TTGGCGAATT AGTGTATCTT ACCTAATAGc TCCGCCTATT AGGTTCCATC 8940

ATTATTATAA ATAATAAGTA CACTACGGtT TACAGTTGGA TCTTCGCTCC AACTGCATAA 9000
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GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 9120 

TTAAATAATA TTGACGGTGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 9180 

GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACGCAAT TGAAGATCAA 9240

GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 9300 

AACTGATGAG AATCCCAACA ATCCAAATTA TCTCATCAGT TCGATTTTTA ATTTACTCGT 9360 

AACCTAGTAT CTCCAGTCTG CAATACATCT AATGTTGCAT CTAATGCATC GACAATTAGA 9420

TTTTTAACTG CAGCTTCAGT ATAAAAGGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 9480

TCAATCAACG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 9540

AGTTTGCGTT CAAATTCATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 9600 

GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC AAATACTGCG 9660

CCCTTTTTAA AATGTTTAAA TAATTCAGCA TTAAATAGAT AATGATTATA TTTCGTTGCA 9720

GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 9780 

TCGACATACG TTGCAATTTT AGCATTAGGA AACGGtCGTA TGCGACCACA TCACTTTGAT 9840

AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 9900

CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 9960

GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 10020

ACTCCGCAAT TGAATTCGGA GAGTATGACG GCAGATTTGA CACAATAAAG TTATACTTGT 10080

TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 10140

CTAGTTCATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 10200

AGCCATCATA ACCAGCGACA CCTTCAAGAT TGTCATCAGT TAATGCTTCT TTAGTAATAT 10260 

CTACeTCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGGCATATCT TCATCACGTA 10320 

CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TTCATATAAA 10380

TATGCTAGTT CTGTTAATGT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAACGCCAAT 10440 

GAAATTCTCA CATAACGATT ACCATTGTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 10500 

GACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 10560

AACCATACAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 10620

GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA 10680

TAATGATTCA AAGCATATAT TGCGGCATCT TGTAATGCAC CAAAGATCCC AGGATTTGTG 10740

TGCGTTTGGT ACTTTTTCAA AGCTTGAATC ATATCTTTAT TACCAACTGC AAAACCGACT 10800








CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AACCGAAAGC ACCATAAGCA 10920

AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG CTATCGCTTC ATCAAAAACT 10980

TCTTTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040

TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATGTGG CGGTTCTAAA 11100

TTAAGCGGGA CTGGCTTGCC ATCAGCTAAA AGTACACCTG CTAAATAATC CGTGTAGCCT 11160

GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220

CCATTTTTTG TACCATATAA AATGCATACT TCATCTTCTT TATCTAAGGT CACATTATAT 11280

TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACGCTT CTTTACCATG AAAAGCACCA 11340

TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT ACCTTGTGGC 11400

GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCGATT 11460

TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520

ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 11580

GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 11640

CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 11700

GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 11760

ATATCCGCAT GCGCAACCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820

GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 11880

TACTTATTTC ACTATGGCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940

CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCGCA 12000

CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12060

TCTTTATAAT CTCGCGATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 12120

ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 12180

AAGATTGTTT CTGAAACAAT ATGCGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12240

TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12300

CTACCTGTTT TAAGTTCCGG CGTCGGCATT AGCACATAAA TACCAGTTTT GCCTTCTGGC 12360

GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 12420

CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480

AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TGCTGAACAA 12540

GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 12600
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is 






ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12720

TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGcCCT 12780 

TGAGCCATGC CATACATACC GCCTTTAATA AAATGCACAC CAAACATCAt TTCAATCATA 12840

GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12900

AACGCTAAAA GCTTTTGTAT CTTTTCGTTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12960

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG TCATATTATA AAAGTCACTC 13020 

GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACGTGCAA TTTCATATTT TTTATAAACA 13080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 13140

TGTAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13200 

TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260 

AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13320 

CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13380 

ACTTCATGAC CTTGAGAAGC AATACGGGGT GCCGCTGCTA ATCCTGTGAG ACCTGCACCA 13440

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTATTTCAT 13500

GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13560 

CCTGtCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC TGCTAATTCT ATGATTGGTT 13620 

GTGCTTCAAT ACTAAATACT TTGATTTGAT CCATAACATC TTGAAAATCT TTTTCTGCGA 13680 

TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13740

CAGCAATATC AACTTCATAT TGCTTTAATC GTTGCTTACT AAAATATATC CGTTCATTGT 13800
CAAAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10088 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 60 

AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AATAGCTGTA ATAGAATACT 120 

AAATGTGACA AACTTAGAAC TAATATCAAG TGTTGATGTT TTGAATATAA AAATGCTAAT 180 
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ATAATTGGTT AATATATGAG TAATTAGAAA ATAGACAAAG GATGACGATT TATGTATATC 3 00 

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 360

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 480

ATTTTACGCC GATGTGATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 540 

CGTGGTTATA ATGGACACAG TCATATTCAT ATTAAAGGCG GCAAGTTTGA TATGAATGGT 600 

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTGAAGA TATTCAATTA 660 

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 720 

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGTGG CGAACcTTTT 780 

ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGSAACgA 840 

CAGATGGTAC GATAACGAAA AATGTCATTA TCGAAGATTG TTATTTTGGA CCTTCAGAAT 900

TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 960 

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 1020 

CTGCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 1140

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 1200 

TGTCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 1260

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 1320

TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 1380

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGCX3C 1440 

ATTSATAAAA CGGTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 1500

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 1560

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 1620

TGATATCGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 1680

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 1740

TTCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 1800

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 1860

TATTACCATT AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 1920

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAC 1980 
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TTGGATTCAT GGTGG CAACA GGATTCTTAT CTATGTTTGT ATCGAACACT GCAGCTGTAA 2100 

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCGA 2160 

ATACGAATCA AACAAGTATT CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220

CAGGTACGAT TGGTGGCTTG GGTAGATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 2280 

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 2340 

TTCCAACGGT CATTGTTTTG TTAGGTATTA CTTGGCTCTA TTTAAGATAT GTTGCGTTTA 2400

GACATGATTT GAAATATTTa CCTGGTGGTC AGACGTTAAT TAAACAAAAG TTAGACGAGC 2460

TTGGCAAAAT GAAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 2520

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2580 

GTACGATTGC TATTTTTATA TCAATATTAT TATTTATTAT TCCAGCTAAA AATACTGAAA 2640 

AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG GTATTTCTGA AAGTGGTTTA GCAAAATGGT 2760 

TAGGCGAACA GTTGAAATCA TTAAATGGTG TTAGTCCGAT TCTTATTGTA ATTGTCATAA 2820 

CAATCTTTGT CTTATTTTTA ACTGAAGTGA CATCTAATAC TGCAACTGCA ACGATGATTT 2880

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TCCATTACTA CTTATGGCAC 2940

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACCAGT AGGGACACCA CCGAATGCAA 3000

TTATCTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3060

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 3120

GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 3180

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 3240

TAASAACAAT ACAAACAAAA GAAAGTCAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 3300

TGACTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 3360

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 3420

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TTTTAAAGGA 3480


CGAGACTCAA TCCAGGATAA ATTAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 3540 

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3600 

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGTTATGTGG ATAGTCTTCA 3660 


ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATTGCT 3720 

GCAACGCCAT TTTCAATTTT AGTCAAACTT TGAATTGTAC TGTCGACATA ATCATAGTCA 3780 
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TTTTTAATAT CAGAAATGGA ATCTGTTCCA TTACCATATA ATGCAAAGTT AATATCTAAA 3 900 

CGTATTTCAC CGTGTGCAAA GAGATCTTGC TGTGCAAGTG CATCTGCCAC AATGTTGATT 3960 

GTTCCTTCTA TAGAATTTTC AATAGGGACA ACACCAATCG ATGTGTCATC ATCTGCAACT 4020 

GCCTTGATGA CTTCAAATAA ATTTGACTTT GGTTGAAAAG TTGCTTCATT TTCAGAAAAA 4080 

TACTGACGAC AAGCCAAATA TGAAAATGTA CCTTTAGGGC CTAAATAATA TAATTGCATA 4140 

TGCTACACCT CTAQTAACTT AATGATGGAA AGGGCACTGG TTAGCATTTG ATTCTTTCTT 4200 

TTTATAGAAA AAGTTTGGAT CTTTTACTGT ATTGTCATAT CCGTGATGAT AATTTGACGT 4260 

CAATGTTGGA GATAATGGCG GTGCTAGCCA AGACCATTTT CCGGTAACTT GACGACCTTG 4320

TTGTGCTTCG TFACGTTCGA ATAGTTCGAA TTGCTTTGCA GCGGTCAAAT GATCGACAAT 4380

TGATACGCCT TCTTTTTTAA AGGAATGATA CACAGCATAG TTCAATTCAA CAAGTGCTCG 4440

ATCTTTATTA AATGAATTAT TTTTAAGTGT ATCAAATTCA AACGCATCTG CAACTTTTTC 4500

TAGTAAATTG TAACGGTAAT CATCAATAAA GTTACGTACG CCAATTTCAG TTACCATATA 4560 

CCAACCGTTA AAGGGTGCAG TTGGATATAC AATGCCACCG ATTTTTAAGT CCATATTGGA 4620 

AATGATAGGG ACTGCATACC ATTTTAAGTT CAATTTTCTT AATTTTGGAT AATGATTATG 4680

TTCAATAGGT ACTTCTTTAA TTAATGAAGT AGGATATTCG TAAAATTTAA CTGACTCATT 4740

AGGTAATTGG TAAATCAGTG GTAACACGTC AAAATTAGTA CCTTTTCCTT TCCAACCTAA 4800

GTGATTTGCT AAGCGTGTAA CTTCTTTTTC AGCAGGATCA CCACAATTGT CATAGCCAGC 4860

ATAGCGAATT AATTGATTGT TGAAAATTTT AGGTCCATCC TTTGGAGCAT ATATAGTAAT 4920

ATACGGCTTT AATTTACCTT CATTTGTAGC CTGTGTAATA TGATAAGTAA TTGATGATAA 4980

GAACGATGCT TCGTCAGTAA CATCTCTTGC ATCAATGACA TTTAACGAAT CCCAAAATAA 5040

ACGACCAATG CAACGATTTG AATTACGCCA AGCCATTTTA GCACCATAAA TAAGTTCTTC 5100 

TTCTGTATGT GTATATGTCC CAGTTTCTTT TATTTCTAGT TCAATGTCAT GTAAACGTTT 5160 

ATTGATAATT TGCGTTTCAT AATGACACTC TTTATACATG TTTTCTATGA AAGCTTGAGG 5220 

CTCTTTAAAT AACATTAACA ACACCTCGCT TTATATTATA GTCTACATTA TTAAAATACT 5280 

CTTAAAAATT ATGTATATGT CATTAAATTG TTGGTTGATT TTAATTAAAA GTATGGAAAT 5340 

TAAGGGGCTC TTATGTATAT AAAAAAATGA ATTATGATAA AATGTAAGAA AATATTTAGG 5400 

TCGATTGGAG AGATACAAGT GTACCAATTA GAAGACGACA GTTTAATGTT ACATAATGAC 5460

TTATATCAAA TAAATATGGC TGAAAGTTAT TGGAATGATA ATATTCATGA AAAAATGGCT 5520 

GTATTTGATT TGTATTTTAG AAAAATGCCA TTTAATAGTG GCTATGCTGT TTTTAATGGT 5580 
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TTAAAGTCTA 


TTGGCTACAA GGATGATTTC ttatcatatt TAAAAGATTT AAAATTCACA 5700

GGCAGCATCC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760

GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 5820

TTCCATACAT TAATTACAAC AAAGGCTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880

TTAATGGAGT TTGGTACACG TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 5940

GCTGCTTACA TCGGGGGCTT TGATTCTACA AGTAATGTTA GGGCGGGGAA ATTATTTGGT 6000

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 6060

GCCTTCAAAA AATATGCTGA AAGAGATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120

ACTTTAAAAT CTGGCGTGCC AAATGCAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCGCTT ATTTATCTAA AGAGGCAAGA 6240

CGTATGCTTG ATGAAGCAGG ATTTACTGAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 6300

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTGGT 6360

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 6420

ATTGAAAATG AAGATGGTTC ATATAGTGAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 6480

GTTACGACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGCA 6540

GAAGGCGATT ATATTACTTT GGAAAATGAA AATCCATACG ATGAACAACC TTTAAAATTA 6600

TTCCATCCAG TGCATACTtA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 6660

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 6720

CGTGAATATT TAGCACTAGG ATTACAATCT ATTTGGGATG AAAATAAGCG TTTCCTGAAT 6780

CCACAAGAAt ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 6840

TTTGAAGTTG CGGAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6900

TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCX3ATAGT GCTGAAGAAA TTATGGAATT 6960

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 7020

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 7080

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGUAGTTCA 7140

AAAAGATGCT GATGAAGTTG AGCAAGCTTT GCGATTCATT GAACCAGATG AAATAGTAAC 7200

AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 7260

TCTTACAGAT TTCCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 7320

AATTGCTTCA AACCGACAAG GTATTGTAGT AGGAACAGAT CATTCAGCTG AAAATATAAC 7380
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, TAAACGACAA GGTCGTCAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT TATATGAAAA 7500 

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7560 

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 7620 

AAAAGTAATT GAAAATCATT ATATAOGAAA TGCACACAAA CGTGAACTTG CATATAGAAG 7680 

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 7740 

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7800 

AAGTGTATGA ACTAAAGTAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 7860 

CGCTATTTAT TTATGAGAGT TTTCGTACTA TATTATATTA ATATGCATTC ATTAAGGTTA 7920

GGTTGAAGCA GTTTGGTATT TAAAGTGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 7980 

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 8040 

AACGTTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCGTTATA TTGCTGCATC 8100

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 8160 

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 8220 

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 8280 

ATATGAGTTA GATTTACCGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CGCACTATGC 8340

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 8400 

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCGTATGA 8460

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8520 

TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaCATGAA aTTCmAAGTA AaTGAGAaTG 8580 

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG TAAAACGTAT 8640

TGAAAAACCC GTGTTTCAAG AGCAAAAAGA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 8700 

CGTTTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGAAG 8760

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8820

CAAGATGCTC GATCCATATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 8880

GGGGTTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8940 

CAATCTCATC GCATGAAATT TTTTCTTCCT AGAGACCTTT AATAAGATTA ATAGTTTACT 9000 

TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 9060 

CCAAAGCTTC TGTGTTCATA ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 9120

CAATAACAAT TGAGCTATTA TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 9180 
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TO 



CATTTAAATC TTGAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 9300

AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 9360

TTCTAAAATA CAAATTTTAA TAACCATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 9420

TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 9480

TCCATATAAT TATAACCTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 9540

CATGCCCTGC GTGCATACCA TTTCTTGATT CTACTCTACT ACCTAAAACA GCAATTCCTT 9600

TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 9660

GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCAGTAAATT 9720

GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 9780

TGTTAGTATT ATAAGTCGCA TTTAGTAATT CAGACATCGT ATAGCCTGCA CACCAACCAT 9840

TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAX *TG s TCGTTGT 9900

AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 9960

GTTGCGTTAC ATTATTGCGA GTAGGTATAG TCTCAGTCTT TnTnAACTnT nTATCTTCTA 10020

GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 10080

TGAGTTGT 10088



(2) INFORMATION FOR SEQ ID NO: 33: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7563 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 60

TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 120

TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 180

TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 240

ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GGATACATTC GTGGTGGGTG 300

TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTGAAAATTA 360

TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATGCAA ATCACAATAG CTGTTGAGGA 420 

TTTGATTACA ATAACTAAAG GCAAAATTGG AGCAGTTATC CATGAATGAT TAATAACAAC 480 
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TGCCACACTC CTTTTTGATT GAATTAGCAT TTTACGATCA TAAACAGTCA TTATAATTGA 600 

GTATTTGAAC ATAAAAATGT AATTTTATCG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 660 

GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 720 

GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTCGC ACAAAGATTT 780 

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 840 

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACGTGTG CATGGTGGTG CAATGTTAAA 900 

AGAAAATCGT ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACGAATC TTGATGAAAA 960 

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATCGA 1020 

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 1080 

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 1140 

AGGTGGTCAA GTTAAAGAAA ATACACTTGC TACGATTGGT TCTAGTGCTA TGGAGATATT 1200

AAGACGATAT TGTTTCGATA AAGCTTTTAT CGGGATGAAT GGATTAGATA TTGAACTTGG 1260

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 1320

TCAATCATTT GTACTTATAG ATCATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 1380

TTTGCTAGAA AGTACGACAA TCATCACATC TGAAAAAGCA TTAAATCAAG AATCGTTAAA 1440

AGAATACCAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 1500

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 1560

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 1620

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 1680

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TTGAAGTTGA TGAAGATACA 1740

CGTATTAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCTCAT 1800

ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 1860

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 1920

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 1980

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAACCTA ATAAAGATGA ATTAGAAGTG 2040

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 2100

GATAAAGGTG CGCAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT TTATATTGAT 2160

AAAGAAATCA GTATTAAAGC AGTTAATCCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 2220

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 2280
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CGGGACGCTA 


TAGAAAAAAT AAAATCACAA GTTACGATTA GCGTACTTGA TGGGGAGTGA 2400

AAATAATGAG AGTAACAGAG TTATTAACAA AAGATACAAT AGCAATGGAT TTAATGGCAA 2460

ATGACAAAAA TGGTGTTATT GATGAGTTAG TAAATCAATT AGACAAAGCA GGTAAATTAA 2520

GTGATGTCGC GTCATTTAAG GAAGCGATTC ACAATCGAGA ATCACAAAGT ACAACTGGTA 2580

TCGGCGAAGG TATTGCCATT CCACATGCCA AAGTGGCCGC AGTTAAGTCA CCAGCTATTG 2640

CGTTTGGTAA AtCTAAAGCA GGCGTAGATT ATCAAAGTTT GGATATGCAA CCAGCACACT 2700

TATTCTTTAT GATTGcAGcG CCAGAAGGTG GCGCCCAAAC ACATCTAGAT GCTTTAGCTA 2760

AGTTGTCTGG TATTTTAATG GATGAAAATG TACGTGAGAA ATTATTACAT GCTTCATCAC 2820

CTGAAGAAGT ACTAGCGATC ATAGATGAGG CTGATGATGA AGTGACAAAA GAAGAAGAGG 2880

CAGAAGCTGA AGCACAACAA GTTGCAACTG CAGAACAATC ATCTAAACAA TCTAATGAGC 2940

CATATGTGTT AGCAGTAACT GCTTGTCCAA CAGGTATTGC ACACACATAT ATGGCACGTG 3000

ATGCATTGAA AAAGCAAGCG GATAAAATGG GTATTAAAAT TAAAGTAGAA ACGAATGGTT 3060

CAAGCGGCAT TAAAAACCAT TTAACTGAAC AAGATATTGA AAATGCAACA GGTATCATTG 3120

TTGCTGCTGA TGTTCATGTT GAGACGGATC GCTTCGATGG TAAAAATGTC GTAGAAGTAC 3180

CAGTAGCAGA TGGTATTAAA CGCCCAGAAG AATTAATTAA TAAAGCATTA GATACAAGTC 3240

GTAAACCTTT TGTTGCCCGT GATGGTCAAA GAAAAGGTAA CTCAAATGAC AGTCAAGAAA 3300

AATTAAGCCC AGGTAAAGCA TTCTATAAAC ACTTAATGAA CGGTGTTTCT AACATGTTGC 3360

CACTTGTAAT ATCTGGTGGT ATTTTAATGG CAATTGTATT TTTATTTGGA GCAAATTCAT 3420

TTAATCCAAA AAGCTCAGAG TACAATGGGT TTGCAGAGCA GCTTTGGAAC ATTGGTAGTA 3480

AAAGTGCATT CGCGTTAATC ATTCCAATTT TATCTGGATT GATTGCACGT AGTATTGCGG 3540

ATAAACCTGG TTTCGCTTCA GGTCTTGTAG GTGGTATGTT AGCAATTTCA GGTGGTTCAG 3600

GATTTATTGG TGGTATTATT GCAGGTTTCT TAGCAGGTTA CTTAACACAA GGTGTTAAAG 3660

CCATGACACG TAAGTTACCA CAAGCATTAG AGGGATTAAA GCCAACATTA ATTTATGCAC 3720

TATTAACAGT GACGGCTACA GGCTTATTGA TGATTTATGG CTTTAATCCA CCAGCATCTT 3780

GGTTAAATCA TTTGTTATTA GATGGATTAA ACAATTTATC AGGTTCTAAT ATTGTATTAT 3840

TAGGTTTAGT TATTGGCGCT ATGATGGCGA TTGATATGGG CGGTCCATTC AACAAAGCGG 3900

CATATGTTTT TGCAACAGGT GCGTTGATTG AAGGTAATGC AGGACCAATT ACAGGTGCAA 3960

TGATTGGTGG TATGATTCCA CCGTTAGCAA TTGCGACAGC GATGTTAATT TTTAGACGTA 4020

AATTTACAAA AGAACAACGT GGTTCAATTA TCCCTAACTA TGTGATGGGT ATGTCATTTA 4080
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TGATTGGTTC AGGTATAGGT GGCGCAATTG 
CTTTAGGCTT AGGTTCACGA ATTACTGCGC 4200

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 4260

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTGTTAG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440

TATATCGTGT TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA ATAGACACCA 4560

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAGTTATG CCAAACGAAG CAGATATAGT 4620

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4680

TTAATTATAT ATAACGGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4740

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4860

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4920

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA GGACATCATA CTTGGCCACT 4980

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 5040

GCGGAgCAAG ATGTTCACAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGGACCATTT 5100

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 5160

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACGT 5220

GAAATTGAAG GTGCAAAAGA AGCGCTTGAA ACGTATAAAG ATGACATTAT TTTTTCAATT 5280

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGGTG TTGAGCGAGG AGCTAAACAT 5340

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 5400

GCAGCATGGT TGAATGATGC TCTACATACC GAAATGATTG TTGATGGCAC TCATTCTTCAT 5460

CCGGCATCGG TTGCAATTGC TTACCGTATG AAAGGTAATG AACGTTTTTA TTTAATTACC 5520

GATGCAATGC GTGCAAAAGG TATGCCTGAA GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5580

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 5640

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700

CGAGTAACAA GTTTAAATCA AGCCATTGCA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760

AAAGTAAATA AGGATGGAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880
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TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6000 

ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 6060

AATAGGGAAT TAATTGGAAA CTTCGACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120 

TGCATTAACC ACTGTATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180 

AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 6240 

TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6300 

AGGTTGGCTT GGTGAACCAA CX3TTTGAAA GCTATTACAC CCAATATTTG AAGCAATCAA 6360

TTTACCAACT GCATTAACGA CGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTAOGTA 6420 

TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 6480 

GCTTGCTTTA GTATATGCAA GACCATTGTT CTATTTCGGT AACATTATGA AACCATTGAT 6540 

TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600

CCAAACTGAT GCAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 6660 

TGGAGAAATC AACCAAACTG AATTGGCATA TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720 

ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780 

TGTAGACGAA TTACTAGAAA CAATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 6840

TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6900 

CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTGCCAA TGATTTCAGA 6960 

GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7020 

TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080 

AATCGTTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 7140 

TGAT5ATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 7200 

CGGTATAGAA TTTGATGACT CTGAGGATAT TGATACGATA GGTGGATGGT TACAATCTCG 7260

TAATACCAAT TTACAAAAAG ATGATTACGT GGATACAACT TATGATCGCT GGGTTGTTTC 7320 

AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAGCGAG 7380

ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 7440 

AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7500 

GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7560 

CGA 7563 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3492 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

TTATATCAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 60

sATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 120

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 240

GCAAGAGGAC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 300

AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 420

GTGCTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 480

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAGAGCTCG TGTAAAGGAT 540

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AGCAGCATGA 600

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 660

GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 720

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCATCTC TTAATAATTT 780

TATTAAAACA TCTTTCTATA TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 840

TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900

AATGTACCGT CTGAAATTTG GTCTACAGAA ATATCGCCTA AAATATCCAG CACTGTAAAT 960

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 1020

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 1080

CGTTCAAAAA ATCTAGGCGC AATTTGATAC ATTTTCAACG CATGaTGCAT CCATTTAGGC 1140

CGATTAATTT CCAATTGTTT TGTTTTAATG CCATAAATGA TATCTTCTGC AAGCTGATTA 1200

GCATCAAGCA TAATTTCCCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 1260

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 1320

AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 1380

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATACCTA CAATATGTGC GTTAGATGTT 1440

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 1500 
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TAAATGAATC CATCGAATGA TGTATTGTCT TCAAATTGCA GTGCCTGTAT CGACTTCAAA 1620 

TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 1680 

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 1740

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTG 1800

ATCAATTAAC CTTCCTTTTC AATTATATAG AATGCAATTT ATCAACTTTA CATAATTGAG 1860

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 1920

AAAACTAAAG GGATGTGaCG TTAATGrAAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 1980

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTAGA TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GTTAACTATA 2100

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 2160

ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220

TTTCAATTAT GGCAGGTATT CCGATTGATT ATATTAAACA ACAATTAGAA TGCCAAAATC 2280

CaGTTGCTAG AATTATGCCA AACACAAATG CGCAAGTTGG ACACTCTGTT ACTGGCATTA 2340

GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAGCAT 2400

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT ATCACCGGAA 2460

GCGGCCCAGC ATTTTTATAT CATGTATTCG AGCAATATGT TAAAGCTGGT aCsAAACTTG 2520

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 2580

TGATTGAACG TTCAGAtTTG AGCATGGCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 2640

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700

ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GACCAATAAA 2760

AACAgACCCG CCAACACATG TATGCATCAT CGCAAGCACT GTGTTTGACG GGTTATTTTT 2820

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTAAAACTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2940

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 3000

CATTGGCTTC TTTTATCAAT GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3060

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 3120

AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180

GTTTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 3240

CATGAGATTT TACTTCTTGA TATTTAGGAC CTGGTTCAAG ACCAATGTTT TTTAACGCTT 3300
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CATGATTAAG TAAATGCGCC TCTACAGTAA AACCATCCAT GATGATATGT CAGATGATCA 3420 

TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 3480

TCCACATATG CT 3492

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1973 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTRAAfSATPA 60

CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTG CAGAAACFTA 120

TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180

CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 240

AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300

TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360

TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 420

AGGATTAGCT TTTGTAGCTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 480

GCCAAAATTT TATCTAGACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 540

GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660

CTTAAAAGCA TTAGATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 720

AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC mAAGATGAAC TTAAAAATnG 780

CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GGTCAAATTT TnAGAATTGG 840

TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGrTA GTATCTGCTT TAGAAATTAT 900

TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960

TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 1020

CATTATTAGA TCACGAACAA TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080

TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTOG TAGTCAAACT ACGGTTACTG 1140

AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGGTGTAG 1200
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10 



GTAATACGAT TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320 

TTCCGCAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 1380 
CTGAGCTTTA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 1440

CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500 

AAAAAGCAAA ATCTTTAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 1560 

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 1620 

TTTTTGCCAA AGCAAAACCT AGTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 1680

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAGGACAAAT TAGTCGGGCA GCTATCGATG 1740

TGTTTGAACA TGAACCTGCA ACTGACTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800 

CACCTCATTT GGGTGCTTCA ACAGTCGAAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860 

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT GCACCTAAAA 1920

TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7620 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

GGTGTTTCAG ATGTCACTGG TTGATTTTTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60

TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAGGCCT 120

AAAT^AACAA AATAAAGAAG TACTAACAAA ATATTAAGAC CCATCGGCAT TAATGTAAAA 180

TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240

GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 300

GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 360

ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420

AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 480

GCCATATACC AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 540

AATAAAACGA ATGATTTCAT AAAACCTACT TGAGGTAATT GTTCCATTGT AATCTCCCTT 600

TCGTTAATCA TATTTATATT TTTAATTATT GTTACCGTTA TAATTTACAA GATTCATTAT 660 
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GTAAAATGAA AACCCGCTAC AAGTACACAT 
CTATATGGAG ACTCATTTGA AAGTCAACGC 780

TTCGTTAACT ATACTAAAAA TATGTCATAC TGCAATGTTC ACGTTTAAAA GAGTCTCAAT 840

CTATGCAAAT AAAATATTCC ATAACAAAGT ATATACTTTA CATTTTTATA ATTCTTAACA 900

ATACTATTTT ATCAAACATT TACCACAATA AAAATATCTT TTTCATTTTT ATTTAAATTA 960

ATCATATAAT TGCGAGGAGA ATATTATGGA TTTCGTTAAT AATGATACAA GACAAATTGC 1020

TAAAAACTTA TTAGGTGTCA AAGTGATTTA TCAGGATACC ACTCAAACGT ATACAGGCTA 1080

CATCGTGGAA ACGGAAGCTT ACTTAGGTTT GAATGATCGT GCGGCTCATG GCTATGGCGG 1140

TAAAATAACA CCTAAAGTCA CGTCATTATA TAAACGTGGT GGTACAATTT ATGCACATGT 1200

CATGCATACG CATTTACTCA TTAATTTTGT AACAAAATCT GAAGGTATAC CTGAAGGCGT 1260

ACTTATCCGC GCAATTGAAC CAGAAGAAGG TTTATCCGCT ATGTTCCGTA ACAGAGGTAA 1320

GAAAGGCTAC GAGGTAACGA ATGGCCCAGG AAAATGGACT AAGGCATTTA ACATTCCACG 1380

GGCTATCGAT GGCGCTACGT TAAATGACTG TAGATTGTCT ATTGATACTA AGAATCGTAA 1440

ATATCCTAAA GATATTATTG CTAGTCCACG AATCGGTATT CCAAATAAAG GTGATTGGAC 1500

ACATAAATCT TTACGTTACA CAGTGAAAGG TAATCCATTT GTGTCTCGCA TGCGTAAATC 1560

AGATTGTATG TTTCCCGAAG ATACTTGGAA ATAAATGCCA TCTTTCATTG ATTACTATCA 1620

TGAAAATGAA ATCTATCTCC TTATAAGTCA ATCAATCGTG CCGTCAACAT GCGGATGGGT 1680

TGATTGTTTT TCTTTGTATC CATCATATTT TTTGATTCAT CTCCTCTTAT TGAACTTGTT 1740

CTTAATTATA AAATATAACA ATAGAATTAT TTATAATTAT TAAATTTAGA TGCATTAATA 1800

TTATTGATAT TATTTTCAAA AACTAGAAAT ATTGATTTGT TGCATGTATA ATGTTAAAAG 1860

CGCCCTTTTA TAACGCTTAC ATATAAAAGC TTATTTAGGG AGAGGGATAT TCAACAAGGG 1920

GGATTTGAAA ATGATAGAAC TTAATGCAAT TACAACATTA TGTTTAGCTT GTATCCTTTA 1980

TTTACTTGGT AAGGCTATCG TTAATCACGT TAATTTTTTA AAACGTATTT GTATACCAGC 2040

ACCAGTGATT GGCGGCTTAA TCTTTGCTAT TTTAGTTGCG GCTTTGGATT CATTTGGCAT 2100

GGTTAAGATT AAATTAGATG CTTCATTCAT TCAAGATTTC TTCATGTTAG CATTCTTTAC 2160

GACAATCGGT CTTGGTGCAT CATTGAAATT ATTTAAATTA GGTGGCAAAG TCTTGCTATT 2220

ATACTTTATG TTTTGTGCTA TCATTTCAGT CATTCAAAAC ATAGTTGGTG TATCACTAGC 2280

AAAAGTATTA AATATTAAAC CTTTGTTAGG ATTAACAGCA GGTTCCATGT CTATGGAAGG 2340

CGGTCATGGT AATGCTGCTG CTTATGGTAA GACAATTCAA GATTTAGGTA TTGATTCGGC 2400

ACTGACAGCG GCTCTTGCAG CTGCAACTTT AGGTCTTGTA TTTGGAGGGC TTATCGGTGG 2460
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ATTTAAAGAT TATAGCCAAG TAGCATATAA 
CATAGTAAAT TTAATGC'-Av- 2580

TGAAGTATTC TTCATTCAAT TTACAATCGT XGTATTCXGT ATGGCAGTTG bAAU 1 1 A 1 1 X 2640

CAGTCATTTG TTTACAGCTC AAACAGGGAT 1 AAlljrl 1L.V-A ATTTACGTTG 2700

TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAG ATT, J. AAA 2760

AATTACTAAT CAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2820

CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880

AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2940

TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000

ATGGCAAATT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAGTT 3060

GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 3120

ATACAATGGT TTAGTTAAAC ACCAAACTCA CGCCTCcTcT 3180

TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA TATTTCCGGA 3240

CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3300

ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTGC AATGCCATTA 3360

TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGTGCTAAA 3420

CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 3480

TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3540

TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 3600

CCACTAACAT CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 3660

ATACCAAATC CAACTTCTTT TATAATGACT GGAACAGACA CTCGTGATAC AATCGACGCT 3720

ATAT^ATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 3780

GAATTAACAT GGATTTGTAA CGCTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3840

TCTACTGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTCGC 3900

GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3960

GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4020

CACTCGCTAC CACCCGTCAT TGCATTAATA TAAACCGGAT ATGCCATCGT TAAGTCAGGC 4080

GTCTGTGATG TCAAATCGAT ATCATTTACA TTAATTGATG GGATAGAATG ATGCACAAAA 4140

CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 4200

TCATTTTTTC TCTGTTCTCT TTGAAAATCA CTCATGATTA AACCTACCTT TTCGTCATTT 4260
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ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 4380 

TTCATACCAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 4440 

AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCTCGTATT 4500

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACGTT GCTTAATCAA GTGGTCATCG 4560

ATATGTTGAA TGTATAGGGA ATGTTTATTA TCTATAATCA AATCACCATT TTGTTTCATT 4620

GTATCAATTA GCTCTTGCAT AGGAAACAGT ACACGTTTTA CTTTAATCAA ATCCGAACGT 4680

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCGAT CATCTACATG GCGGTCTTCA 4740

AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 4800

TCTAAAATTA ATATGACGAC ATCTGCACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4860

TTATAGACTA CTTTATTTAA TGATTCCAAC GTTTGATGAT GATATGTTAC TAATACATTG 4920

TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4980

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 5040

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCTTTAA ACCATTCATT TTCTTGTTCA 5100

TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 5160

TCAAGTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 5220

CCCCCTAAAT AACGAATTCT CTATTATTTT ATCATGAATT AAATAACGTG TATGTCTTAA 5280

TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 5340

CTCAAACCGT AACATCTGAA TTCATTGACC ATAACAATCA TATGCATGAT GCAAATTATA 5400

ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 5460

AACGCGAAAA TTTAGCATAT ACGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 5520

AATTGTCTCT TGGCGATGTA TTTACTGTTA CTTTATATAT TTATGATTAC GATTATAAGC 5580

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT GAATCATTTT 5700

CAACACAAAT AGGACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 5760

GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5820

GATTCATATC GAATTACTAG ATTTATTAGA TGATGTTAAG TTTGAATTAA GAGAATTAAA 5880

TGCACAAAAA GGGTTATACA TTAAGGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGGA 5940

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6000

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTTCAAAAT GAGGTTGCTC 6060
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ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180 

ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 6240 

CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6300

TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 6360

TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 6420

CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 6480

ATAGCTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAACATATAA ACTGGTGGCA 6540

ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600

CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 6660

AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 6720

ATTTATTCGG TAATGGCTGT TGATTAkCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780

CACCACCTGA ACCATCACCC ATGAGTAOGA CATTTTGATG TCCTACTTCA GATACTAATT 6840

GaTCATAAAC ACGTTGTATC GCTTGGnAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6900

TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6960

TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 7020

TATCAATTTG ATGTCTGAAA TTAAAGCGAA AGACTTGCAT ATCATCTAAT GACAATTTTT 7080

CTAAATTTGC TTTAACATTT AATGTTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 7140

CTCTTTTATA AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 7200

GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 7260

ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 7320

GTATATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 7380

TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 7440

TACGGAGGTA CCTTGCATGA CAAATCCAAA TCAACGATTA GAACCATTTG ATGAGACATT 7500

TCAACAACCG AATATTCATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 7560

CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 7620

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9834 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) 


SEQUENCE DESCRIPTION: SEQ ID NO: 37:








GTCATtACCG amTTTCtTAG AaTCATTTAA AGATGATAAA TATACAAACG TTGGTAATTT 60

AAAAGAAGTG AATTTTGATA AAATTGCTGC GACGAAACCC GAAGTAATCT TTATCTCTGG 120

AGGTACAGCT AATCAAAAGA ATTTAGATGA ATTCAAAAAA GCTGCACCTA AAGCGAAAAT 180

TGTTTATGTT GGTGCAGATG AAAAGAACTT AATTGGTTCA ATGAAACAAA ACACTGAAAA 240

TATCGGAAAA ATTTACGATA AAGAAGATAA AGCTAAAGAA TTAAATAAAG ATTTAGATAA 300

CAAAATTGCT TCAATGAAAG ATAAAACGAA AAACTTCAAT AAAACTGTTA TGTATTTACT 360

AGXTAACGAA GGTGAATTAT CAACATTTGG ACCTAAAGGT CGTTTTGGTG GATTAGTTTA 420

CGATACATTA GGATTCAATG CAGTTGATAA AAAAGTAAGT AATAGCAATC ATGGACAAAA 480

TGTTTCTAAC GAATATGTTA ATAAAGAAAA TCCAGATGTT ATTTTAGCGA TGGATAGAGG 540

TCAAGCGATA AGTGGTAAAT CAACTGCGAA ACAAGCATTA AATAATCCTG TATTAAAAAA 600

TGTTAAAGCA ATTAAAGAAG ACAAAGTATA TAATTTAGAT CCTAAATTAT GGTACTTTGC 660

AGCTGGATCA ACTACAACTA CAATTAAACA AATTGAGGAA CTTGATAAAG TTQTAAAATA 720

ATTTTAAAAG AGGGGAACAA TGGTTAAAGG TCTTAATCAT TGCTCCCCTC TTTTCTTTAA 780

AAAAGGAAAT CTGGGACGTC AATCAATGTC CTAGACTCTA AAATGTTCTG TTGTCAGTCG 840

TTGGTTGAAT GAACATGTAC TTGTAACAAG TTCATTTCAA TACTAGTGGG CTCCAAACAT 900

AGAGAAATTT GATTTTCAAT TTCTACTGAC AATGCAAGTT GGCGGGGCCC AAACATAGAG 960

AATTTCAAAA AGGAATTCTA CAGAAGTGGT GCTTTATCAT GTCTGACCCA CTCCCTATAA 1020

TGTTTTGACT ATGTTGTTTA AATTTCAAAA TAAATATGAT AGTGATATTT ACAGCGATTG 1080

TTAAACCGAG ATTGGCAATT TGGACAACGC TCTACCATCA TATATTCATT GATTGTTAAT 1140

TCGTQTTTGC ATACACCGCA TAAGATTGCT TTTTCGTTAA ATGAAGGCTC AGACCAACGC 1200

TTAATGGCGT GCTTTTCAAA CTCATTATGG CACTTATAGC ATGGATAGTA TTTATTACAA 1260

CATTTAAATT TAATAGCAAT AATATCTTCT TCGGTAAAAT AATGGCGACA SCgTGTTTCA 1320

GTATCGATTA ATGAACCATA AACTTTAGGC ATAGACAAAG CTCCTTAACT TACGATTCCT 1380

TTGGATGTTC ACCAATAATG CGAACTTCAC GATTTAATTC AATGCCAAAT TTTTCTTTGA 1440

GGGTcrrrrG TACATAATGA ATAAGGTTTT CATAATCTGT AGCAGTTCCA TTGTCTACAT 1500

TTACCATAAA ACCAGCGTGT TTGGTTGAAA CTTCAACGCC GCCAATACGG TGACCTTGCA 1560

AATTAGAATC TTGfATCAAT TTACCTGCAA AATGACCAGG CGGTCTTTGG AATACACTAC 1620

CACATGAAGG ATACTCTAAA GGTTGTTTAG ATTCTCTACG TTCTGTTAAA TCATCCATTT 1680
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AGTGTTCTTT TTGAATAATG CTATTACGAT AATCTAACTC TAATTCTTTT GTTGTAAGTT 1800 

TAATTAACGA GCCTTGTTCG TTTACGCAAA GCGCATAGTC TATACAATCT TTAACTTCGC 1860 

CACCATAAGC GCCAGCATTC ATATACACTG CACCACCAAT TGAACCTGGA ATACCACATG 192 0 

CAAATTCAAG GCCAGTAAGT GCGTAATCAC GAGCAACACG TGAGACATCA ATAATTGCAG 1980 

CGCCGCTACC GGCTATTATC GCATCATCAG ATACTTCGAT ATGATCTAGT GATAATAAAC 204 0 

TAATTACAAT ACCGCGAATA CCACCTTCAC GGATAATAAT ATTTGAGCCA TTTCCTAAAT 2100 

ATGTAACAGG AATCTCATTT TGaT AGGCAT ATTTAACAAC TGCTTGTACT TCTTCATTTT 2160 

TAGTAGGGGT AATGTAAAAG TCGGCATTAC CACCTGTTTT AGTATAAGTG TATCGTTTTA 2220 

AAGGTTCATC AACTTTAATT TTTTCATTTG GGATAAGTTG TTGTAAAGCT TGATAGATGT 2280 

CTTTATTTAT CACTTCTCAG TACATCCTTT CTCATGTCTT. TAATATGATA TAGTATTATA 234 0 
2 0 CCAATTTTAA AATTCATTTG CGAAAATTGA AAAGAAAGTA TTAGAATTAG TATAATTATA - 2400 

AAATACGGCA TTATTGTCGT TATAAGTATT TTTTACATAG TTTTTCAAAG TATTGTTGCT 2460 

TTTGCATCTC ATATTGTCTA ATTGTTAAGC TATGTTGCAA TATTTGGTGT TTTTTTGTAT 2520 

TGAATTGCAA AGCAATATCA TCATTAGTTG ATAAGAGGTA ATCAAGTGCA AGATAAGATT 2580

CAAATGTTTG GGTATTCATT TGAATGATAT GTAGACGCAC CTGTTGTTTT AGTTCATGAA 2640

AATTGTTAAA CTTCGCCATC ATAACTTTCT TAGTATATTT ATGATGCAAA CGATAAAACC 2700

CTACATAATT TAAGCGTTTT TCATCTAAGG ATGTAATATC ATGCAAATTT TCTACACCTA 2760

CTAAAATATC TAAAATTGGC TCTGTTGAAT ATTTAAAATG aTGctACCGC CAATATGTTT 2820

TGTATATTTT ACTGGGCTGT CTAAGAGGTT GAATAATAAT GATTCAATTT CAGTGTATTG 2880

TGATTGAAAA CAATTAGTTA AATCACTATT AATGAATGGT TGAACATTTG AATACATGAT 2940

AAACTcCTTT GATATTGAAA ATTAATTTAA TCACGATAAA GTCTGGAATA CTATAACATA 3000

ATTCATTTTC ATAATAAACA TGTTTTTGTA TAATGAATCT GTTAAGGAGT GCAATCATGA 3060 

AAAAAATTGT TATTATCGCT GTTTTAGCGA TTTTATTTGT AGTAATAAGT GCTTGTGGTA 3120 

ATAAAGAAAA AGAGGCACAA CATCAATTTA CTAAGCAATT TAAAGATGTT GAGCAAAAAC 3180

AAAAAGAATT ACAACATGTC ATGGATAATA TACATTTGAA AGAAATTGAT CATCTAAGTA 3240

AAACTGATAC AACTGATAAA AATAGTAAAG AATTTAAGGC ACTACAAGAA GATGTTAAAA 3300

ACCATCTCAT ACCTAAATTT GAAGCATATT ATAAGTCAGC AAAAAATTTG CCTGATGATA 3360

CAATGAAAGT TAAGAAATTA AAAAAAGAAT ATATGACGCT TGCAAATGAG AAGAAGGATG 3420

CGATATATCA ATTAAAAAAA TTCATAGGTT TATGTAATCA ATCTATCAAG TATAACGAAG 3480
AATTAGCTGA TAATAAAAGT GAAGCAACTA ATCTTACGAC AAAATTAGAA CATAATAATA 3600

AAGCGTTAAG AGATACTGCG AAGAAGAACC TAGATGATAG TAAAGAAAAT GAAGTAAAAG 3660 

GCGCGATTAA AAATCACATT ATGCCAATGA TTGAAAAGCA AATTACCGAT ATTAACCAAA 3720 

CTAATATTAG TGATAAGCAT GTTAATAATG CAAGGAAAAA CGCAATAGAA ATGTATTACA 3780

GTCTGCAGAA CTATTATAAT ACACGTATTG AAACAATAAA GGTTAGTGAG AAGTTATCAm 3840

AAGTCGATGT AGATAAGTTG CCGAAAAAGG GTATAGATAT AACTGACGGC GATAAAGCCT 3900

TTGAAAAAAA GGTTGAAAAA TTAGAAGAAA AATAACTATA ATCATTTTTC AAAGTTAAAA 3960 

ATTTTGAATT TATGGTTAAC ATGTCAACTT ACTATGTGTA TAATGGTAAA CATTGATATT 4020 

AACTATATGT ATAAAAATGT CACGCAGATG CTATTTAAAT GTGATAAATA TTTTTAGAGG 4080 

TGAATAGAGT GGCTATAAAG CTAAGTTCAA TTGACCAATT TGAACAGGTT ATTGAGGAAA 4140

ATAAATATGT TTTTGTATTA AAACATAGTG AAACTTGTCC AATATCGGCA AATGCGTACG 4200

ATCAATTTAA TAAATTTTTA TATGAACGCG ATATGGACGG TTATTATTTG ATTGTCCAAC 4260 

AAGAACGCGA TTTGTCAGAT TATATTGCTA AAAAAACGAA CGTTAAACAT GAATCACCTC 4320 

AAGCATTTTA TTTTGTAAAT GGTGAAATGG TTTGGAATCG AGACCACGGT GATATCAATG 4380

TGTCGTCATT AGCACAAGCA GAAGAATAAT GAAACTATAG GGTTGGAACA TTTTGCCTTA 4440

CACTAGTAGA CGTGAATAGC ACAACTTAAA TTCGTGTGAA TCAGAGTAGT TTGGCTATAA 4500

TGATGTTCTG ACCTTTTATT TTATGTCACC TTTAGAAGCA GTTAAGTTAG TACTTTTTTA 4560

CAAACATATG TATAATATAT TCGAGTATTT TTATTGAAAa tATTTTGGAA AACGACGAAT 4620 

CCAATAAGAA AATTTAAACA TGATTTGTAA GTTAGTTTAA TAGGAAATAT ATGCTAAACC 4680 

AAAAGAAGCA TATTGTTATT TACTGGAATA ATTAATAATC ATGTCATGTT AAATGTTAGC 4740

ATATAATCAC GAGATAAAAT CTAAAATTTA AGATTAATCT TTTATGAATA AAAAACGTAT 4800 

CACAACAAAT AATAAAGTAA GGTGGTCAAG GTTATGAAAG TATTAGTAGC CATGGATGAG 4860 

TTTCATGGAA TTATTTCAAG TTATCAAGCT AATAGATATG TTGAAGAGGC AGTTGCAAGC 4920 

CAAATTGAAA CTGCAGATGT AGTTCAAGTA CCATTGTTTA ATGGAAGACA TGAATTATTA 4980

GATTCTGTAT TTTTATGGCTn ATCTGGGcaA AAGTATCGTA TAGCAGTACA TGATGCAGAT 5040

ATGAATGAAG TTGAAGGTGt TTACGGACAA ACTGATACAG GGATGACCGT TATCGAGGGG 5100 

AATTTATTTT TAAAAGGTAA AAAACCAATT GTTGAACGAA CAAGTTATGG TTTAGGAGAA 5160 

ATGATTAAAC ATGCATTAGA TAACGACGCA AAACATGTTG TAATTTCACT AGGTGGGATT 5220 

GATAGTTTTG ATGCTGGTGC AGGTATGTTA CAAGCATTAG GTGCTCAATT CTATGATGAC 5280
GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 5400

TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 5460 

AATCATAATC AAGCAGCAGA AATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 5520

AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 5580

GCAGTCTTGA ATGGACTGTA TCAAGCTGAA ATATTAACCA GTCATGCATT AGTAGACCAA 5640

CTAACACATT TAGAAAATTT AGTTGAACAA GCGGATTTAA TTATTTTTGG AGAAGGATTA 5700 

AATGAAAATG ATCAGTTGCT AGAAACGACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 5760 

CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 5820 

CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 5880 

AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 5940 

TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 6000

CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATCGAAAG TATACGTAAA AAATATGAAA 6060

AATCACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 6120

TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC GATATCTTCT ACAATTCCAA 6180

TGATTACTTG TACTGCTTTT TCCaTAACAT CAATGGATGC aTATTCATAT GGGCCGTGGA 6240

AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 6300

CATCTGTACC ACCGGGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 6360

GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 6420 

GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCGTCACGTA 6480

TTTCTAAAAT ACGTTTCTTA CGCAATTCGA ATTGTTTTTT ATCATGATCA CGAATAATGT 6540

ATTGCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 6600

ATCCTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTGACCTA 6660 

AACGTATTGC GTTTACCATT GCATTTTTAG CTGAACCAGG ATGAACATTT ACACCGTGGC 6720 

ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 6780 

CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 6840

GACCGATTTC TTCGTCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 6900

GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 6960

CTAGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 7020

GAAATACTTT AGGATCTAAG ACACGTTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 7080
GCGCCAAAAA TCCAACTGTT GGGACGTCGA CATCGATGTT ACTTTCTAAT GTAGCAAATA 7200

AGTAGCCATT TTCATCTAAA TCAGTTGGCA ATCGTAATTG TTGTAATTCT TTTTCTAATA 7260 

AATGTAACAA ATCCCATTGC TTTTCAGTTG AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320 

GCGTATCAAT TGTCGTATAT CTTGTTAATC TATCTATCAA TTGGTTCTTC ATTATATTCG 7380 

ACCCCTTAAA CTCTATTATT CATGTTGTAA GATTTTTTAT ATGTCTTACC TTTGATTTTA 7440

CCATACAGTT GTTTGATACG TGTGTATAGG TAATATAGAA TTTCAGAAAC TAATATACCG 7500 

AAAGCAATCG CACCTGAAAT CAGTGTAcTT CTAAAAATGT ATTTACAGCA CTTGTATAAT 7560 

CATTTGATAC TAAAAAACGA GTCGCTTGAT AAGCTGCACC ACCAGGTACT AATGGTATAA 7620 

TGCCTGGCAC TATGAATATA ATTACCGGTC GTTTATATCT GCGACTCATA GTATGACTCA 7680 

TTAAGCCTAA AATTAAGCTT CCCAAAAATG AAGCGCCAAC TTTTCCAAAC TCTAAATCTA 7740 

CCGTTAATTG GTAAATCGTC CATGCAATGG CACCCACAAA TCCACATGCT ACTAAGAGGC 7800 

GTTTGGGTGC ATTGAAAATG ATAGAGAAAA GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860 

GAAATAAATA AAATAGCATG CTTTAACAGT CCTTCCTTAA ATGATTAATA AAACGATTGC 7920 

GACAiCCAGCA CCGATTGCGA ATGCTGTTAA TGCAGCTTCA ACACCX5CGAG ACATACCTGC 7980 

AAGTAATTCA CCCGCTAATA AATGTCGAAT GGCATTGGTA ATTAATATAC CAGGGACAAG 8040 

TGGCATGACA CTGGCTATAG TAATGATATC TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100 

TGTGGCTGCA ATGGATATGA CCACAGCGGC TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160 

TATATAGCGT tGCACAAAGC TGAATGTTAA AAATGCGGAT CCGCGAGCAA TGACTGCAAT 8220

CCAACAATCT GATGCGACAC CACCAAACAT AAATAGGAAG AAGCCACATG CAATGGCAGC 8280 

TGCAAAGAAA TTCGTTAAAA AAGAATATTG TAATGATGCA TGCTGTAAAT GAATAAATTC 8340 

AGATTTAGCT TCATCAATTG TGAGTTCTTT ATTTGATATT TTACGTGAAA GACTATTCGT 8400

TAAAGGGATT TTCTCTAAAT CTGTTGTACG CTCTTGTACA CGAATTAATC TTGTACTTGT 8460

TCGATCGTTT AATGAAAAAA TAATTGCAGT TGAACTGACA AAACTATATG TATTATGAAG 8520 

ACCATAACTA TGTGCGATAC GGTTCATTGT ATCTTCAACT CGATATGTTT CAGCACCTGA 8580 

TTCaAGTAAA ATTCTACCTG CAATTAATAC AACATCAATC ACTTTGTTTT CATCTATAAT 8640

TGTGATTGAA TCTGGCATAT CAATTCACCT CCAATGATAT GTGTTATTTA TTTGAACAAT 8700 

TGaAGTTTAC AACTTGTTGT TACAACTTTC AATAGTGAGA CTTTGTGTTA GTATGATGAA 8760 

CTTGTATGGT TCAAATTTAA ATAAGAAAAA CTGTTAATCT TTGCTATTAT ACTATGATTT 8820 

AATAATAGCA AAGGATTAAC AGTTTTGTCG TTGTTATAAA TTGATAATAG GGTTAAACAT 8880
TTTACGCTGT GATTTTGGAT CGTCATCTGT TAAATAACCA ACACCGATAG ACACTGACAA 9000

TTTAATAACT TCTTTGTTTG GTAAATGGAA TGATGATTTT TCAACACCCG AACGAATATT 9060 

TTCAGCTAAT TTAAGACTTT GATCAAGTGA ATAATTGTGA ATGACAACTG AGAACTCTTC 9120 

GCCACCATTT CTAAAAATTT TAAATTGATT CGGCACATAG TTTTTAAGTA ATTGAGACAT 9180 

TTGTTTTAAT ACAGCATCAC CTGATTTGTG TGAGTAGGTA TCATTGaCAT CTTTAAATCC 9240

ATCGATATCG ATTAATAATA ATGCGATACT TTGATGTTCT TTTTCAGCTT TTCGTGAAAT 9300 

TTCATTTAAA TGTCTATCAA ATTCTTTTAC ATTACCTAAG CCTGTTAAGT AATCATATTT 9360 

ATCTTCGTTT TCATAACGAT TTACGAGTGA GAAGAAATGC CAAATATCGA CAAATGTTAT 9420 

CGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA ATGATAACTT CCGATAGTGT 9480 

GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT ATTGTAACAA CATTAAGTAT 9540 

TAATAATGAT AGCACATCAT TTTGTTTTAA AAATGGTCCA ATAGCACTTG TTACTGCAGC 9600

AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA AATACTACAA TTTCAACAAT 9660 

TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT GTAAATCTAC CTAAAAACAA 9720 

TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA TAAGGGATAG GGTAGACAGA 9780

TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA GCCTTAGAAA AAAC 9834 
(2) INFORMATION FOR SEQ ID NO: 36: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23439 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TCTCAATCAG ATGAAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 60 

TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120 

GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180 

GAGCTTGAAG CGGTAAGTAT TAATGACTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 240

AACTTAACAT TAAACTTTAT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 300

ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGGCATATT AAAAGCTCAA 360

GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTCGC AGACGGTCAT ATTAGCACTA 420

TTTGGTACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480 
TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 600

ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 660

GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720

ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 780

TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 840

CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AGCTACAAAC 900

CTTGTACCAT TTTTAACGGT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 960

GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 1020

AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 1080

TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACGTAG CGCGGCGTTA GATACTGAAA 1140

ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200

TTATTGTTAC ACATGATGAA CGACTTAAAG CATATTGTGA TCGTTCATAT CATATGAAAG 1260

ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320

GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTCA ATTAATAATA 1380

ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 1440

TCTAAGTATT CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500

TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560

TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAGCAA 1620

TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680

AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCGCT GCCGCTGGAT 1740

TAACAGCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 1800

GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860

ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920

CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCAGTAGT GAATAGTACA TTGATTCGAA 1980

TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040

GTGGTATTAT TACGGTTGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100

TAGGTTTAAC GGGTGTAGCT ATGGCTATTG GTGCCATGGC AGCATTTAGT TCGGCATTTA 2160

TGAATGGGAC GCTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220

GTATTGAACC TTTATCACAA GCAGATATTG TATCAGCCAA TCCAATTCCA ATCTATATTA 2280
ATGCGACAGG TACAGCTACA CCGATTGCAG GATTTTTAGT TATGTTTGGA TTTAATCATC 2400

CGACGACAAT TGTGATTTAT GGTGTAGTAA TGGCGATTGT AGGTGCGCTT GCAGGTTATC 2460

TTGGTTCAAT TGTATTTAAA AAATATCCAA TTGTTACTAA GCAAGACATG ATTAATCGAG 2520

GTGCAGTAGA CGCATAGCAT CATCATATTG AATAGTAAAA ACAAATAAAA CATAGTAACG 2580

TGATTCAGTC GATGTAACAG TCGATAATGA GTCACGTTTT TTTATAGAAA AATACAAGAC 2640

ATAAAAATGT CATAATTTAT TGTCGACAAA TATCATACTG TATAAACATT TATCATTTTC 2700

TCAAGTACCT TTTACACGAT GGAATGAACT TACTTTTTAC GAAATTATGC GTATTTTATA 2760

AACAAATATC ATTGATATAA CGGTAAATGT AAGCGTTTAC AACAGAAATA ACAGCATGCT 2820

ACGATATTTT TGTAAATTCA CTGATTCAAG TATTTTAAGT CAATATGAGG AGGGATGTTA 2880

TGAGCGATTC TGAGAAAGAA ATTTTAAAAA GAATTAAAGA TAATCCGTTT ATTTCACAAC 2940

GTGAACTTGC TGAGGCAATT GGATTATCTA GACCCAGCGT AGCAAACATT ATTTCAGGAT 3000

TAATACAAAA GGAATATGTT ATGGGAAAGG CATATGTTTT AAATGAAGAT TATCCTATTG 3060

TTTGTATTGG CGCAGCGAAT GTAGATCGTA AGTTTTATGT GCATAAAAAT TTAGTTGCAG 3120

AAACATCAAA TCCTGTAACG TCAACACGCT CTATTGGTGG CGTAgCAAGA AATATTGCTG 3180

AGAACTTAGG TAGGCTTGGC GAAACGGTCG CTTTTTTATC TGCTAGTGGA CAAGATAGTG 3240

AATGGGAAAT GATTAAACGA TTGTCCACAC CATTTATGAA TTTGGATCAT GTTCAACAAT 3300

TTGAAAATGC GAGTACAGGT TCATATACAG CTTTAATTAG TAAAGAAGGC GACATGACAT 3360

ATGGCTTaGC AGATATGGAA GTGTTTGACT ACATTACGCC TGAATTTTTA ATTAAGCGTT 3420

CACACTTATT GAAAAAGGCT AAGTGCATTA TTGTAGATTT GAATTTAGGC AAAGAGGCAT 3480

TAAACTTCTT ATGTGCCTAT ACCACGAAAC ATCAAATCAA ATTAGTTATC ACCACGGTTT 3540

CTTCCCCAAA AATGAAAAAT ATGCCTGATT CATTACATGC TATTGATTGG ATTATCACGA 3600

ATAAAGATGA AACAGAAACA TACTTAAATT TAAAAATAGA ATCTACTGAT GATTTAAAAA 3660

TAGCTGCTAA ACGCTGGAAT GATTTAGGTG TTAAAAATGT TATTGTGACA AATGGCGTGA 3720

AAGAACTCAT TTATCGAAGT GGTGAGGAAG AAATCATTAA GTCAGTTATG CCATCAAATA 3780

GTGTGAAAGA TGTTACAGGT GCAGGCGATT CATTCTGTGC TGCAGTAGTG TATAGCTGGT 3840

TAAATGGGAT GTCTACTGAA GATATATTAA TTGCTGGTAT GGTTAACGCA AAGAAAACGA 3900

TAGAAACGAA ATATACAGTT AGGCAAAACC TAGATCAACA GCAACTTTAT CACGATATGG 3960

AGGATTATAA AAATGGCAAA TTTACAAAAG TATATTGAGT ATTCTCGAGA AGTTCAGCAA 4020

GCACGGGAGA ACAATCAACC GATTGTAGCA TTAGAATCAA CAATTATTTC GCATGGTATG 4080
GCCATTCCAG CAACCATAGC CATTATAGAT GGCAAAATTA AAATTGGTTT AGAAAGCGAA 4200

GATTTAGAAA TACTGGCAAC TAGTAAAGAC GTTGCTAAAG TATCTAGAAG GGATTTAGCA 4260

GAAGTTATTG CGATGAAGTG TGTTGGTGCT ACTACTGTAG CGACGACGAT GATATGTGCT 4320

GCAATGGCTG GTATTCAATT TTTTGTTACA GGAGGTATTG GGGGCGTCCA TAAAGGTGCA 4380

GAACATACGA TGGACATTTC AGCAGACTTA GAAGAACTGT CTAAAAGAAA TGTGACTGTT 4440

ATCTGTGCAG GTGCCAAATC AATTTTAGAC TTACCTAAGA CGATGGAGTA TTTAGAAACA 4500

AAAGGCGTTC CAGTTATTGG ATATCAAACG AATGAATTGC CAGCATTCTT CACTCGCGAA 4560

AGCGGTGTTA AGTTAACAAG TTCGGTTGAA ACGCCAGAAC GACTTGGTGA CATTCATTTA 4620

ACAAAACAGC AGTTAAATCT TGAAGGTGGC ATTGTTGTTG CTAATCCAAT TCGATATGAG 4680

CATGCCTTAT CAAAAGCATA TATTGAGGCA ATCATAAATG AAGCTGTTGT TGAAGCGGAA 4740

AATCAAGGTA TTAAAGGTAA GGACGCCACA CCGTTCTTGT TAGGGAAAAT TGTAGAAAAA 4800

ACGAATGGTA AAAGTTTAGC AGCAAATATA AAACTTGTTG AAAACAATGC GGCGTTGGGT 4860

GCTAAAATTG CTGTCGCTGT TAATAAATTA TTGTAGGTGA TGATACATGA ATATTTTATT 4920

CGCTATCACA GGGATAGCAT TTGCACTATT TGTTGCGTTT TTATTCAGTT TTGATGGTAA 4980

AAAAATAGAC TTCAAAAAGA CGTTAATAAT GATATTTATT CAAGTGTTGA TCGTGTTATT 5040

TATGATGAAC ACAACGATTG GTTTGACAAT TTTAACTGCA CTAGGTTCAT TTTTTGAAGG 5100

GCTAATAAAT ATTAGTAAAG CAGGCATAAA TTTTGTTTTT GGAGATATAC AAAATAAAAA 5160

TGGCTTTACG TTCTTTTTAA ACGTATTACT GCCATTAGTT TTTATTTCTG TATTAATAGG 5220

CATCTTTAAT TATATTAAGG TATTACCATT TATTATCAAA TATGTAGGTA TCGCTATTAA 5280

TAAAATAACT AGAATGGGGC GCTTAGAAAG TTATTTTGCT ATTTCAACAG CAATGTTTGG 5340

GCAACCAGAA GTATATTTAA CAATAAAAGA TATTATTCCA AGATTATCTA GAGCGAAATT 5400

ATATACAATT GCGACGTCTG GTATGAGTGC TGTTAGTATG GCAATGCTAG GTTCATATAT 5460

GCAGATGATT GAACCCAAGT TCGTAGTTAC AGCAGTAATG TTAAATATTT TTAGTGCGCT 5520

TATCATCGGC AGTGTAATCA ATCCCTATAA ATCTGATGAT ACTGATGTTG AAATTGATAA 5580

CTTAACGAAA TCCACAGAAA CTAAAACATT GAATGGAAAA ACAGGAAAAC [illegible] 5640

TGCCTTTTTC CAAATGATTG GTGATAGTGC GATGGATGGG TTTAAAATCG CTGTTGfAGT 5700

AGCCGTAATG TTGTTAGCAT TTATTTCATT AATGGAAGCA ATTAATATCA TGTTTGGTAG 5760

TGTTGGTTTG AACTTTAAAC AGCTTATTGG CTATGTGTTT GCACCAATCG CATTCTTAAT 5820

GGGGATTCCA TGGAGCGAAC TGTTCCAGCT GGCTCTTTAA TGGCGACTAA ATTAATTACA 5880
CAAGGTATCA TTTCAGTTTA CTTAGTAAGC TTCGCTAATT TTGGTACGGT TGGTATCATC 6000

GTAGGTTCAA TTAAAGGCAT TAGTGATAAA CAAGGAGAAA AAGTTGCATC CTTTGCAATG 6060

AGGTTGCTAC TTGGTTCAAC TCTAGCTTCA ATCATTTCAG GATCAATCAT TGGCTTAGTA 6120

TTGTAAATGA ATCGAAGTAC CTAAATTAAA TTCATGGCAA AGCTAAACCC CGTCACCAAG 6180

TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA TCTACTTTTT 6240

CGTAGCCGTT [illegible] [illegible] TTTATCTTTT TCAAAAATTG TTAATCCCGT 6300

TATATCTTTT TTATGTTTTG [illegible] GAAGCTAAGT ATATAAGCAA AGACAAAAGC 6360

AACTGTAAAT GAAATGGTAG ATACATAGAA AGGTGAGTTA CCTTTGCCAA CACCATTATA 6420

GACATAAGCA AAGATGATAC CCAATATTAA TCCACAAATA ACACCGAATG TATTCGTACG 6480

TTTAGTGAAA ATACCAACTG CAAATACACC AGCCAATGGA ACGCCGAATA ATCCAGTCAC 6540

AAACAAGAAT AAATCCCATA AGTCATTTGA ATTAGAAGCA ATTAAGTATA GTGACATTCC 6600

AAAACCGAAA ATACCTGCAA TGATAATAAT GAAACGTGCA AAGTTAACTT CGTGTCGCTC 6660

GCTACCTTTT CCGAAGAAGC GTTGCTTAAT GTCGATTGAA ATACAAGCAG ATATAGAATT 6720

TAAACTAGAT GAAATGGTAG ACTGTGCAGC GGCGAAAATG GCTGCAATAA GTAATCCTGC 6780

TACAAATGGT GGCATCTCAG TCAAAATGAA ATATGGCACT ACAGATGATG TATTGAAGCC 6840

TTTTGGTAAA ACAGCTTCAT GTGTATAAAA TGAATACAGC [illegible] IACCATAAAA 6900

TAAGGGTGCT GAAATTAAAG CTAGGATACC ATTTGTCCAT [illegible] TTGTTTCTTT 6960

TAAACTATCA GAAGCTTGAT AACGCTGCAC GACGTCTTGA [illegible] [illegible] 7020

GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080

CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140

ACCG&CTTTA ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200

GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260

AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320

GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380

TAATAATGAG CCAATGACAC GTATGCTAGG GCCAAATCTA GCTTCTAAAT ATTCATATGC 7440

AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500

TGCGACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560

TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620

AGATGGCAAG CGACCACTTG CGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680
TGTGCCAAAT CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GATTCTTTGA 7800

TGTCTATAAA TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 7860

AGGTTTGAAA GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 7920

CAATGTTGGA TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 7980

CAGTTGGTAA GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 8040

GCGACCATTA ACGTTATATG TAGAACCAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 8100

AACTAACATT TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 8160

TTCGAGTAGG AAGAAGTTTG GCGCTGTATA TTTAACACCA ACAATTTTTT CATGATTAAA 8220

TAGCTCGCTG AATTGTTCAA TAGAAATATT CACACCTGTT AAATCTGGTA TTGCATAAAT 8280

AATCATATTG TTCTGAGTTG CTTCGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 8340

AGTAAATGGA TAGTAGAATG GTGTTACGGC AGAAAGTGCA TCATAACCGA GTTCTGTGGC 8400

ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAACGAA CCTACTTGAG CAATCAATTT 8460

CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 8520

TAATAAAAAG TTTTCGCCTG AGCTAGCATT TAGATAAAGA CCGTCTAATT CTTCAGTTTC 8580

AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 8640

AGGAACGAGT AACGCTGCAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 8700

AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 8760

ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 8820

CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 8880

TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 8940

GGACAACAAA AGTGAGCTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 9000

GCAATATCAG TTGATCCAAC CTGTCATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 9060

AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 9120

GCGATTATTA AAATCACTGT CTCCTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 9180

[illegible] TTQAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 9240

TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 9300

TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 9360

ACGTGCTGCA ACGAGTGCAT TGAAAAAGCG CATGATTGCC GGAGGATTTA CGAGAAGCAC 9420

ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT GATATTGCAA AACAAATATT 9480
AGGGCTTATA TTAATTGGGG GCGGTATATC TGAACAAGGA GATAATCTCA TTAAATATAT 9600

CGAGCCGAAA GTTGCACACT ATTTACCAAA AGACTATGTT TATGCACCAA TACAAACGAC 9660 

TAAGAGTAAA AATGATGCAG CATTATATGG CTGTTTGCAA TGATAGTTGA AAGAAGGAGT 9720 

CATTCTAAAA TAGAATTTGA AACCGTTACG AGAGATGAGA GCTGTTGTTA GTTCCACACA 9780 

TCACACTCTA TCTAGGACCA ATCTAAACTA TATCAACCAA cAGTGTGCCA CGGGCAAATT 9840

AAATTGAAGA AGCTGAGATA TTAAAATTTT AGAAAATGTA AAAAAATATT TGGTATTGAA 9900 

ATTAAAAAAG CACCTAGCAA CTCGTTGGGA CAATCACGAT GATTGTCTAC AGTTGCAGGT 9960 

GGATTTGAAT ATACTACTAG TTATTTGTTG TCTAGGATAA TAGATTTAGT ATGTTGATAA 10020 

GTTTGACTCA GATTCGTATT TTCTAATAAA TGATAACTCA CGATATCGAT TAAAAAGAGT 10080 

GTCGCAATTT GTGTGTTGAT AAATTGATGG TCGGTATTAC GCGATTGATC CGTTGTTAAA 10140

AGTACTAAAT CTGCACAATC TGTAAGTTTA CTACCTTCAA AATTTGTGAT GGCAACGACA 10200

TATGCACCAT GAGATTTGGC GACTTCCGCT GCAGAAATTA ATTCCGAAGT ATTACCACTA 10260

TTTGACATAG CAATAAACAT ATCCGAATGA GATAGTAGGG ATGCCGATAT TTTCATTAAA 10320

TGTGAATCGG TAGTAACATT ACCTTTTAGC CCCATACGAA TCATACGATA ATAAAATTCA 10380

GTCGCTGATA AACCAGAGCT ACCTAGTCCA GCAAAGAGTA TATGTCGACT TGATTGAAGT 10440

TTGTCGATAA AGGTTTGGAT AATGTCGTTA TCAATAAATT CACCAGTTTG TTGAATGATT 10500 

TGTTGATGAT ATTTATGAAT TCTTTGAATA ATTGGGCTAT TTTCAATAAC TGTCTCTGTC 10560 

ATTTCTTGTT GAATATTAAA TTTTAAATCT TGGAAATTCT CATAATCCAG CTTATGACTA 10620

AAGCGTGTCA TCGTTGCTGG TGATGTACCA ATCGCATGGG CTAAGGAGTT AATCGTTGAA 10680

AAGGCATCGC TATAACCATT TTGTCTTATA TAATTGACGA TGCGTTTATC AGTTTTTGTA 10740

AATAAATGTT GATAACGTTG AACACGATTC TCAAATTTCA TTGTGTCACC CCTTCATCTT 10800

AATGATTACT ATTATATATG AAAAATATTT TCAAGATAGT AAAAAGCATT GATAAAAATT 10860

ATCTTAATGA TATATTGTAA ATGACTTTAC GTGAAAAAAC GACTTATGGA GTGAGGAATA 10920

ATGTTACCAC ATGGATTAAT AGTATCTTGT CAGGCACTAC CAGATGAACC ATTGCATTCA 10980

TCTTTTATTA TGTCGAAAAT GGCATTAGCT GCGTATGAAG GTGGTGCTGT TGGTATTCGC 11040

GCAAATACTA AGGAAGACAT TTTAGCAATT AAAGAAACGG TAGATTTACC AGTTATTGGC 11100

ATTGTGAAAC GTGACTATGA TCACTCAGAT GTTTTCATTA CTGCAACGTC AAAAGAAGTT 11160

GATGAACTGA TAGAAAGCCA ATGTGAAGTC ATTGCATTGG ATGCAACGTT ACAGCAACGT 11220

CCGAAAGAAA CGTTAGACGA ATTAGTATCA TATATTAGAA CACATGCACC GAACGTTGAA 11280
TATATTGGCA CGACGTTACA TGGCTATACT AGTTATACGC AAGGACAATT ACTTTATCAA 11400

AATGACTTCC AATTTTTAAA AGATGTACTA CAAAGTGTTG ATGGAAAAGT TATTGCGGAA 11460

GGTAATGTCA TTACACCGGA TATGTATAAA CGTGTGATGG ACTTAGGCGT TCATTGTTCA 11520

GTCGTTGGTG GTGCGATAAC ACGACCAAAA GAAATTACGA AACGTTTTGT TCAAATTATG 11580

GAAGATTAAA TGATAACGAT AAAAAAACGA GATGACCATC ATTAATTAAA GGCACCTAAT 11640

TATCTTAGGT GGCTGAATGA ATGTAATGGG TTCATCTCGT TTTGTTTGTT TATGATAGTG 11700

ATTTTATTTT CAACTTTATC CAAAAATAAG TAAAGCGACG GGGATGGTGA TTAATAGCGA 11760


15 


CAAOGCCACG 


CGTAAAAACC 


AAATGATGAT 


GAGTTTCCAG 


ACAGGTATTT 


TAATTTCfAGT 

X *mMm XXX \mjwT»\J X 


11820 




TGCTAGTATA 


CATGGCACTA ATGCTGAGAA 


AAAGATAATG 


GCTGATACGC 


x x fiv i/iv«r>vw 


11880 




GACGACAAAT 


TTAGTACTCA 


TTGCAGCTTT 


AGTTACTAAP 


AAAGATGGTA 


GAAAPATPTT' 


x x. u 


20 


TACAATAGAA 


AckCTGACGC 


TTTTGCTAGT 


AAAGCCTGAT 


vAv^Anl X wVJ 


GAAAATATAA 
u/Uv-AX AX An. 


i-UUU 




ATAAATGGAT 


AGAAGATATA 


GCCAAGCCAA 


TCAATGAATG 


GTGTATAGTT 

u X vj X X X X 


\— .VJ v_ A rt\__rvrt X V— 


X .£ UO U 




AGTCCTAAAA 


AACCAATCGA 


TAATATAGAA 


GGTAAAATAP 
w x aaaa x rt^v_ 


fAAPAGTHAT 


■i"i'r ^l^i R i 




25 


TCTTTCAAAT 


TGTGCCAAAC 


GTTCTTCACG . 


AGAGATGGTG 


x x _r_^_. x vjt \__n x x 


X XV_rX X lWiiC 


l^lOU 




GCCTCTGCAT 


ATGCAGTTTT 


CAGTCTGCTT 


CCTTCAATAG 


f AA( _ i*"i"t " i ~ I "(TI 

WWv* X X x> X X w 


X X V- X V_.v» 1 X \_ 1 


X> U 




TGTCCGTTAT 


AATATTCTGT 


TG A TT O A T*TY5 

X >J*V X X V— X X V— » 


CTGATTGGCG 


GTAGC C!ATGC 


rt.vj X Art X X ULn 




30 


GTCACGACAA 


ATGTGATGAC 


TAAAGTTATC 


CAAAAGTATA 


AATTC CAATG 


TfifiPATTAAT 

V— vjVjV«~rt X IaaX 






CCTAAAGTTT TAGCAACGAT AATCATAAAA GTTGCTGAAA CTGTTGAAAA GCCAGTCGCA 12420

ATAATCGTGG CTTCTCGTTT GTTGTACATC CCTTGCTTAT AGACAGGATT AGTAATCAAT 12480

AATCCTAAGG AATAACTGCC GACAAACGAA GCCACTGCAT CGACAGeGGA TTTTCCTGGT 12540

GTTOAAAAA TAGGTCTCAT AATAGGCTCC ATATAAACAC CGACAAATTC TAATAAGCCA 12600

TAGCCCACTA ATAAAGAAAG cGcAATTGCA CCTACTGGAA TTAAGATACT TAATGGCATC 12660

ATTAATTTTT CAAACAAAAA CGGACCATAG TTAGCTTTAA ATAGTATTGA TGGACCGATT 12720

TTAAATACAT ACATTATACC GATCATTGCA CCTGCAACTT TAAATAATGT AATGACCAAG 12780

TTTGTGATTG AAGTCATAAA AGTAGGTGTC ACTATTGGTA ACGCTGTACC AATTAAAATC 12840

ATAATCAGTG CAACATAGGG CATAAGTGGA CCTATGATTG AGCGAATGGC TAGATGAACA 12900

TGATCGACGA AAATAGTGTT GTTAGCATTA ATCGTAAAAG GAATAAAGAA ACATAGTATG 12960

CCCACTAAAC TATAGACAAA AAAACGCCAT GCACTTGGTT GTTGTGCATT AGAATGATAT 13020

TGATTCATTA AAGCAACCCC TTTGTTTAAA TGAATACACA AAACTGTATG ATGCATCTTC 13080
ATAGTTTGAA TTATTTTCAT ACCAATACAA ATTAACTAAT TATATATAGA TTGAAACTAT 13200

ATTACTTAAT AAAATATTTA TCTTAAATGT TGTTGTGTTG ATTCAACACC ACAACTAAAA 13260 

GTGTTTATAA ATTATTTGGA AATACACATA TTTGTAAATG ATTAGTATCG ATTTAATATC 13320 

GTATTATTAA ATTTTTATTA ATTTTGTAGT CTTAATCmAA AAATAATATA TGTCATGTTA 13380

TATTGAAGGT GCAGTTGTTT TTCATTCTCA AGAGGGGGTC AAAAAAATAC TTTTGAGGTG 13440 

ATTATATGTT AAGAGGACAA GAAGAAAGAA AGTATAGTAT TAGAAAGTAT TCAATAGGCG 13500 

TGGTGTCAGT GTTAGCGGCT ACAATGTTTG TTGTGTCATC ACATGAAGCA CAAGCCTCGG 13560

AAAAAACATC AACTAATGCA GCGGCACAAA AAGAAACACT AAATCAACCG GGAGAACAAG 13620 

GGAATGCGAT AACGTCACAT CAAATGCAGT CAGGAAAGCA ATTAGACGAT ATGCATAAAG 13680 

AGAATGGTAA AAGTGGAACA GTGACAGAAG GTAAAGATAC GCTTCAATCA TCGAAGCATC 13740

AATCAACACA AAATAGTAAA ACAATCAGAA CGCAAAATGA TAATCAAGTA AAGCAAGATT 13800 

CTGAACGACA AGGTTCTAAA CAGTCACACC AAAATAATGC GACTAATAAT ACTGAACGTC 13860 

AAAATGATCA GGTTCAAAAT ACCCATCATG CTGAACGTAA TGGATCACAA TCGACAACGT 13920 

CACAATCGAA TGATGTTGAT AAATCACAAC CATCCATTCC GGCACAAAAG GTAATACCCA 13980 

ATCATGATAA AGCAGCACCA ACTTCAACTA CACCCCCX3TG TAATGATAAA ACTGCACCTA 14040

AATCAACAAA AGCACAAGAT GCAACCACGG ACAAACATGC AAATCAACAA GATACACATC 14100

AACCTGCGCA TCAAATCATA GATGGAAAGC AAGATGATAG TGTTCGCCAA AGTGAACAGA 14160

AACCACAAGT TGGCGATTTA AGTAAACATA TCGATGGTCA AAATTCCCCA GAGAAACCGA 14220

CAGATAAAAA TACTGATaAT AAACAACTAA TCAAAGATGC GCTTCAAGCG CCTAAAACAC 14280

GTTCGACTAC AAATGCAGCA GCAGATGCTA AAAAGGTTCG ACCACTTAAA GCGAATCAAG 14340

TACAACCACT TAACAAATAT CCAGTTGTTT TTGTACATGG ATTTTTAGGA TTAGTAGGCG 14400

ATAATGCACC TGCTTTATAT CCAAATTATT GGGGTGGAAA TAAATTTAAA GTTATCGAAG 14460 

AATTGAGAAA GCAAGGCTAT AATGTACATC AAGCAAGTGT AAGTGCATTT GGTAGTAACT 14520 

ATGATCGCGC TGTAGAACTT TATTATTACA TTAAAGGTGG TCGCGTAGAT TATGGCGCAG 14580 

CACATGCAGC TAAATACGGA CATGAGCGCT ATGGTAAGAC TTATAAAGGA ATCATGCCTA 14640

ATTGGGAACC TGGTAAAAAG GTACATCTTG TAGGGCATAG TATGGGTGGT CAAACAATTC 14700

GTTTAATGGA AGAGTTTTTA AGAAATGGTA ACAAAGAAGA AATTGCCTAT CATAAAGCGC 14760 

ATGGTGGAGA AATATCACCA TTATTCACTG GTGGTCATAA CAATATGGTT GCATCAATCA 14820 

CAACATTAGC AACAGCACAT AATGGTTCAC AAGCAGCTGA TAAGTTTGGA AATACAGAAG 14880 
ATTTAGGATT AACGCAATGG GGCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA 15000

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CATCAGACGA CAATGCTGCC TATGATTTAA 15060 

CGTTAGATGG CTCTGCAAAA TTGAAGAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 15120 

CGACTTATAC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 15180

GTACATTTTT CTTAATGGCT ACAAGGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 15300

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 15360 

TCATACAAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420 

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AGTAGCATCA CAGTGTTGAA 15600

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTG TTATAGCGTT 15780

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACGATCAAA 15840 

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TGATGATATG 15900 

TTCGGTTTCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 15960

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16020 

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16080

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 16140 

GCTTTGAGCT ATTTTTGCGT AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260 

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16320 

AATCGCATCG AATGAtGCgn AGnACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16380

ATAAACGATA ACTTAGTTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440 

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16500 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16560 

AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16620

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 16680 
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CATCATTTTA 


ACAATATCTT TAAAAGCAGC ATGTGGAATG GCTAAATCTT CTAAATCTGC 16800

CATAGAAAAT TCAAGATTGA TATCATGTGG TCGCTGTTCA GCAAGTTTAT GCACAAAGTC 16860

AGGTTCTGTG ACAAAAGGCG AAGACATGCC GAGCATATCT GCATGTTGTA AAGCATCTAA 16920

AGCAGACTCT GGAGAATTAA TCCCGCCACT TGCAATTAAA GGGATACGAC CTGCTAAATG 16980

TTCATAGACA ATTTGGTTAA CTGGTCGACC GAAATGATCA CCTGGTGTAC GAGACGTATT 17040

TTGATAAATA TGTCGACCCC AGCTAGCGAT TGCTAAGTAT TGGATGTTTG AAACGTCCAT 17100

GACCCAATTG ATTAATTGGT TGAACTCGTC AATGGTATAT CCTAAATCAC TGCCTCTGGT 17160

TTCTTCTGGC GTTGCTCGAA ATCCTAAAAT AAAATTGTCA GGTGCTTCTT TATCAATCAC 17220

TTCTTGTACC GCACGCATAA CTTCTAAACA TAATCTTGCA CGATrrrrxA ATGAGTCGGC 17280

ACCGTAATGG TCTGTACGTT TATTCGAAAA AGTTGAGAAA AATGTTTGAA TCAGCAAACG 17340

TTGTGCAATC GAAATTTCCA CACCATCAAA ACCTGCTTTA ATCGCGCGTA ATGTAGCATC 17400

GCGATACTGC TGAATGATGC TATTGATTTT CTCATGAGAC ATGGCGATAA CATCGTGTTC 17460

AATCGGTGAA TGCAATGTCA TAGGGCTTGG TCCATACACC TTTCCAAAAT TTAAAATGGC 17520

TTGATTTGAA AAACGACCAG CATGCGCTAg CTGGATAATA GCGAGGCTAC CATGTTGTTT 17580

CATCGTAGAT GCCATGTTAG TTAATCCAGG GATACAAGCA TCATGATCAA TATTAAAGCC 17640

ATATTCAAAC AATTGACCAT AAGGTTCAAT GTAAGCAGCG CCGGTGACTT GCATTCCAGC 17700

TGAATTAGAG CGACGTGCAG CATAAGCCAA GTCTTCTTTT GTAATATAGC CTTCTTTTGT 17760

TGATGTGTTT ACGGTCATTG GTGATAATAC AAAGCGATTC GAAATTTTGA TGCCATTAGG 17820

TAAGTGGATT GATTGTAAAA GTGGTTTGTA TCGGTACATA CTATGATTCC TTTTCTATTC 17880

AATATTGTTT TCAAAGTACC ATGGAAAGAA TGAATAATCA ATGATGAACA GTCTTGATAG 17940

AATAGAATTG GTACATGGAA AGTATTTTTA AAATTAAACT AATGAATGGC ATTTGTAGGT 18000

CTGAAAATAT GAATATGAAA AAGAAAAATA AAGGCGAAAA GATATAAAAG TTAATTGAAA 18060

AACGTTATCA TATACGTGGG TATATGAAGA GGGAATGGTA TTAAGAACGC TAAAATGTTA 18120

TGTCGGTTTG ACATGACAGG ATAAGTTTGG AGATGACGGA TTGGTTAAAT TAAGCGTATT 18180

AGACTATGCC TTAATAGATG AAGGTAAGGA TGCACAAAAG GCATTGCAAG ATTCAGTGAC 18240

ACTTGCAAAA TTAGCAGATC GACTTGGCTT TAAGCGAATT TGGTTTACGG AACATCATAA 18300

TGTACCAGCG TTTGCGTGTA GTAGTCCAGA ACTTTTGATG ATGCATACAT TGGCGCAGAC 18360

AAATCACATA CGAGTTGGCT CTGGTGGTGT GATGCTGCCG CACTATCGAC CTTATAAAAT 18420

TGCTGAGCAT TTTAGAATGA TGGCAGCGTT ATATCCAAAT CGTATTGATT TAGGTATTGG 18480
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TAGTTACGAT GAATCGATTT CGTTATTACG 
TGATTATCTT ACAATAAAGG ATAAACCAAG 18600

TGCGCATACG TTAGGTGTCC AACCACACAT TGATCATTTT CCAGAAATGT GGTTATTAAG 18660

TAGTAGCGCA ACATCTGCCA AAATAGCTGC CGAACTAGGT ATAGGGCTTT CTGTTGGAAC 18720

ATTTTTGCTA CCAGATATAA ATGCGATACA TACAGCGAAG GATAACATTG ATATTTACAA 18780

AAAACATTTC CAAGCATCAA CGATTAAAAT GGACGCAAAG GTGATGGCAT CTGTATTTGT 18840

CATTGTAGCT GATAACGAAG CGGAAGTAGC AGCATTACAA CATGCCTTAG ATGTTTGGTT 18900

ATTAGGTAAA TTACAATTTG CAGAATTTGA AGATTTTCCT TCAGTAGACA CAGCACAAAA 18960

GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020

AGGTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080

TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140

ACTCGCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200

AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260

ACCTGAATTG CAAGATGATA TTGGGACAGT AGGTTATGTT GAATTCGTAA GTCCAGATGA 19320

AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAACGGTCA TTGATGTGCA 19380

AACGCCATTG TCAGGAACGA TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440

TTTAAACTCT GAAAAACCAG AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500

AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 19560

TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGAGGTAT TGGTAATGCC 19620

ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680

ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 19740

ACATGTTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800

AGGTGATATC ACGACGTTAA AAATCGATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19860

AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19920

TCAAGTTCGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980

TAAAGCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGTTGG 20040

TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 20100

TCTTAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 20160

ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 20220

AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 20280
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CAATGTCTCT GTTAATGGAT GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 20400 

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 20460 

CATATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 20520 

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 20580

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 20640 

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700 

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 20760 

AACAGTGTAG cTCAGCATTG TCATGCTCAA ACGTATCGCA ATGATGATTT AATTCGTAAA 20820

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 20880 

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGATGGT TGAAGATGCT 20940 

GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACATCA AGATGATAAA 21000 

GTGTTGTATT TGGAAATTGG AATTGGTTAT ACTACACCAC AATTTGTGAA GCATCCTTTT 21060 

CAGCGTATGA CACGTAAAAA TGAAAATGCC CTTTATATGA CGATGAATAA AAAGGCATAT 21120 

CGCATTCCGA ATTCAATTCA AGAACGTACC ATACATTTAA CTGAGGATAT CTCAACATTG 21180 

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240

GATGTACTTA ATAGAACCGA TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 21300

CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 21360

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT AGTGCTATAG AAGTGAATCA 21420

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 21480

GTATGTTGAT AAAGGTGCCG TTAATATGTG TTGTATTTTA GAACAAGACA CTTCAATTTA 21540

TGGTSATTTT CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 21600 

AGATGTGGTA CAAAGCGGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 21660 

CGCAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720 

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT CGAAAGGGAT 21780

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900 

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960 

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22020 

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22080 
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AGAAGCATTA CAAGGAACAA AAATGACAAG AGAAGATTTA ACGCATCAGT TAAAGCAATT 22200 

AGACATCGTT TATTATTTTG GCAATGTTAC GGTAGAAGCA TTAGTGGATA TGATTTTAAG 22260 

TTAATATTGT TATTTTATGT ATGCTGAATC ATTGGAAGTG TTTGCTTGCT CTTGAAAAGG 22320 

TGACAATAGT GTTTGGTGAA GGTTGAACAT ATGAGTGGAA ATTATTGCCT TTAACTATTC 22380 

AAAGTATGAT ATATATATGG TTTTTGTTTC TAAATGATTG GGTATTTGAA AATAGATGAG 22440

TTTAATATTT TAAGGAATAT AATGATGTTT ACTTTTATAA TTCATATAGA ATATTAAGCA 22500 

ATATAAGTCT GTTGATATAT ACAAAATATA ATGACTGCTA TAATGAGTAA TCAATAGACA 22560 

CAAAGAGGAG ATTATGTGAT GAATAATAAA GTATTAGTAA CCGGTGGTAC AGGGTTTGTT 22620 

GGCATGCGAA TTATTTCACG ATTATTAGAA CAAGGTTATG ACGTACAAAC GACGATACGT 22680 

GATTTAAGTA AAGCTGATAA AGTAATTAAA ACAATGCAAG ACAATGGCAT TTCCACAGAG 22740 

CGATTAATGT TTGTCGAAGC GGATTTATCA CAAGATGAAC ATTGGGATGA AGCAATGAAA 22800 

GATTGCAAGT ATGTCTTGAG TGTAGCATCT CCGGTGTTTT TCGGTAAAAC AGACGATGCA 22860 

GAAGTGATGG CGAaCTGcAA TTGAAGGTAT ACAACGTATT TTAAGAGCTG CAGAACATGC 22920 

GGGTGTTAAA CGTGTGGTAA TGACTGCAAA CTTTGGTGCA GTTGGTTTTA GTAATAAAGA 22980

TAAAAATTCA ATCACAAATG AAAGTCATTG GACAAATGAA GATGAACCAG GCTTATCAGT 23040

ATATGAAAAA TCAAAATTGT TAGCTGAAAA GGCAGCGTGG GATTTTGTTG AGAATGAAAA 23100

TACAACAGTA GAATTTGCCA CAATCAATCC AGTTGCAATT TTTGGGCCAT CATTAGATGC 23160

ACACGTTTCA GGAAGCTTTC ATTTATTAGA AAATTTATTG AATGGTTCAA TGAAACGTGT 23220

ACCGCAAATT CCGTTAAATG TTGTTGATGT GAGAGACGTA GCTGAACTGC ACATTTTGGC 23280

AATGAGAAAT GAACAAGCTA ATGGCAAGCG ATTTATTGCG ACGGCTGATG GACmAATTwA 23340

tTTGTTGGGA ATTGcCAAAt TAATTAAAGA AAAGGGCCTG GAAATAGCTC CAAAAGTTCC 23400

TACTAAAAAA TTACCCAGCT TTATTTTGAG CnAnGnGCC 23439
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4522 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 60 
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TATTATGCAG TCGATTTAGG GAAATCATAT CGTCTAATTG ACGAAAGCAT GTTAGAGGAT 180 

TTGAAGTTAA CTGAACAACA AATAAGAGAA ATGTCTCTGT TTAATGTTAG AAAATTGTCA 240 

AATTCATATA CGACTGATGA AGTAAAAGGT AATATTTTTT ATTTTATTAA CTCAAATGAC 300

GGGTATGATG CAAGTAGGAT ACTAAATACT GCATTTTTAA ATGAAATTGA GGCACAATGT 360

CAAGGCGAAA TGCTCGTAGC AGTGCCACAC CAAGATGTGT TAATTATTGC AGATATACGC 420

AATAAAACAG GATATGATGT GATGGCACAT TTAACAATGG AATTTTTCAC TAAAGGTCTA 480

GTTCCAATTA CATCATTATC CTTTGGATAT AAACAGGGTC ATCTTGAACC GATATTTATT 540

TTAGGTAAAA ATAATAAACA AAAAAGAGAT CCAAACGTGA TTCAGCGTTT AGAAGCAAAT 600

CGTCGTAAAT TTAATAAAGA TAAATAGAAA TAATTGGATA AGGAGTTTTG TCATAATGAA 660

TTTATTTTAC AATCCTAAAT ATGTAGGAGA TGTCGCATTT TTACAAATTG AACCAGTTGA 720

AGGTGAATTA AACTACAATA AAAAAGGTAA TGTTGTTGAA ATTACtAATG AAGGTAATGT 780

TGTAGGTTAT AATATTTTTG AAATTTCAAA AGATATAACA ATTGAAGAAA AAGGTCATAT 840

TAAATTAACT GATGAACTTG TAAATGTATT CCAAAAGCGT ATTTCAGAAG CTGGTTTTGA 900

TTATAAATTA AATGCTGATC TATCACCGAA ATTTGTAGTT GGCTACGTTG AAACTAAAGA 960

CAAACATCCT GATGCAGATA AATTAAGTGT ACTAAATGTA AACGTTGGAA ATGACACATT 1020

ACAAATTGTA TGTGGCGCGC CTAACGTTGA AGCTGGACAG AAAGTTGTTG TTGCTAAAGT 1080

AGGTGCAGTG ATGCCTAGCG GTATGGTAAT TAAAGATGCT GAATTACGTG GTGTTGCCTC 1140

AAGCGGTATG ATTTGTTCAA TGAAAGAATT GAATTTACCT AATGCACCTG AAGAAAAAGG 1200

TATTATGGTA TTAAATGACA GCTATGAAAT TGGACAAGCA TTtTTTGAAT AATTAAGGAA 1260

GGTAGTGAAA ATATGAGCTG GTTTGATAAA TTATTCGGCG AAGATAATGA TTCAAATGAT 1320

GACTTGATTC ATAGAAAGAA AAAAAGACGT CAAGAATCAC AAAATATAGA TrACGATCAT 1380

GACTCATTAC TGCCTCAAAA TAATGATATT TATAGTCGTC CGAGGGGAAA ATTCCGTTTT 1440

CCTATGAGCG TAGCTTATGA AAATGAAAAT GTTGAACAAT CTGCAGATAC TATTTCAGAT 1500

GAAAAAGAAC AATACCATCG AGACTATCGC AAACAAAGCC ACGATTCTCG TTCACAAAAA 1560

CGACATCGCC GTAGAAGAAA TCAAACAACT GAAGAACAAA ATTATAGTGA ACAACGTGGG 1620

AATTCTAAAA TATCACAGCA AAGTATAAAA TATAAAGATC ATTCACATTA CCATACGAAT 1680

AAGCCAGGTA CATATGTTTC TGCAATTAAT GGTATTGAGA AGGAAACGCA CAAGCGAAAA 1740

ACACATAATA TGTATTCTAA TAATACAAAT CATCGTGCTA AAGATTCAAC TCCAGATTAT 1800

CACAAAGAAA GTTTCAAGAC TTCAGAGGTA CCGTCAGCTA TTTTTGGCAC AATGAAACCT 1860
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AAACAAAAAT 


ATGATAAATA TGTAGCTAAG ACGCAAACGT CTCAAAATAA ACAATTAGAA 1980

CAAGAAAAAC AAAATGATAG TGTTGTCAAA CAAGGAACTG CATCTAAATC ATCTGATGAA 2040

AATGTATCAT CAACAACAAA ATCAATGCCT AATTATTCAA AAGTTGATAA TACTATCAAA 2100

ATTGAAAATA TTTATGCTTC ACAAATTGTT GAAGAAATTA GACGTGAACG AGAACGTAAA 2160

GTGCTTCAAA AGCGTCGATT TAAAAAAGCG TTGCAACAAA AGCGTGAAGA ACATAAAAAC 2220

GAAGAGCAAG ATGCAATAGA ACGTGCAATT GATGAAATGT ATGGTAAACA AGcGGAACgC 2280

TATGTTGGTG ATAGTTCATT AAATGATGAT AGTGACTTAA CAGATAATAG TACAGATGCT 2340

AGTCAGCTTC ATACAAATGG CATAGAGAAT GAAACTGTAT CAAATGATGA AAATAAACAA 2400

GCGTCAATAC AAAATGAAGA CACTAATGAC ACTCATGTAG ATGAAAGTCC ATACAATTAT 2460

GAGGAAGTTA GTTTGAaTCA AGTATCGACA ACAAAACAAT TGTCAGATGA TGAAGTTACG 2520

GTTTCGAATG TAACGTCTCA ACATCAATCA GCACTACAAC ATAACGTTGA AGTAAATGAT 2580

AAAGATGAAC TAAAAAATCA ATCCAGATTA ATTGCTGATT CAGAAGAAGA TGGAGCAACG 2640

aATAAAGAAG AATATTCAGk AAGTCAAATC GATGATGCAG AATTTTATGA ATTAAATGAT 2700

ACAGAAGTAG ATGAGGATAC TACTtCAAAT ATCGAAGATA ATACCAATAG AAACGCGTCT 2760

GAAATGCATG TAGACGCTCC TAAAACGCAA GAGTACGCAG TAACTGAATC TCAAGTAAAT 2820

AATATCGATA AAACGGTTGA TAATGAAATT GAATTAGCAC CGCGTCATAA AAAAGATGAC 2880

CAAACAAACT TAAGTGTCAA CTCATTGAAA ACGAATGATG TGAATGATAA TCATGTTGTG 2940

GAAGATTCAA GCATGAATGA AATAGAAAAG AATAACGCAG AAATTACAGA AAATGTGCAA 3000

AACGAAGCAG CTGAAAGTGA ACAAAATGTC GAAGAGAAAA CTATTGAAAA CGTAAATCCA 3060

AAGAAACAGA CTGAAAAGGT TTCAACTTTA AGTAAAAGAC CATTTAATGT TGTCATGACG 3120

CCATCTGATA AAAAGCGTAT GATGGATCGT AAAAAGCATT CAAAAGTCAA TGTGCCTGAA 3180

TTAAAGCCTG TACAAAGTAA GCAAGCTGTG AGTGAAAGAA TGCCTGCGAG TCAAGCCACA 3240

CCATCATCAA GATCTGATTC ACAAGAGTCA AATACAAATG CATATAAAAC AAATAATATG 3300

ACATGAAACA ATGTTGaGAA CAATCAACTT ATTGGTCATG CAGAAACAGA AAATGATTAT 3360

CAAAATGCAC AACAATATTC AGAGCAGAAA CCTTCTGTTG aTTCAACTCA AACGGAAATA 3420

TTTGAAGAAA GTCAAGATGA TAATCAATTG GAAAATGAGC AAGTTGATCA ATCAACTTCG 3480

TCTTCAGTTT CAGAAGTAAG CGACATAACT GAAGAAAGCG AAGAAACAAC ACATCCAAAC 3540

AATACTAGTG GACAACAAGA TAATGATGAT CAACAAAAAG ATTTACAGTC ATCATTTTCA 3600

AATAAAAATG AAGATACAGC TAATGAAAAT AGACCTCGGA CGAACCAACA AGATGTTGCA 3660
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AGAACCACAA GTTATTGAGT CGGACGAGGA CTGGATTACA 3780 

TGACGCATTA TTTTACTTTA ATGTACCTGC AGAAGTACAA 3840 

TGTTACAAGA TTTGAATTAT CAGTTGAAAA AGGTGTTAAA 3900 

ACAAGATGAC ATTAAAATGG CATTGGCAGC GAAAGATATT 3960 

AGGAACTAGT CGTGTTGGTA TTGAAGTTCC GAACCAAAAT 4020 

TTCTATTATT GAATCTCCaA GTTTTAAAAA TGCTGAATCT 4080 

GTATAGAATT AATAATGAAC CATTACTTAT GGATATTGCT 4140 

TGCAGGTGCA ACTGGATCAG GGAAATCAGT TTGTATCAAT 4200 

ATATAAAAAT CATCCTGAGG AATTAAGATT ATTACTTATC 4260 

AGCTCCTTAT AATGGTTTGC CACATTTAGT TGCACCGGTA 4320 

TACACAGAGT TTAAAATGGG CCGTAGAAGA AATGGAACGA 4380 

TTACCCATGT ACGTAnTATA ACAGCATTTA ACnAAAAAGC 4440 

CAAAAATTGT CATTGTAaTT GATGAGTTGG CTGATTTAAT 4500 

TG 4522

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 751 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

TCAAGTTTAC GGATACGTAT ATATTTTGCA TGACATTTAG TGCAATAATA TTCATAATTT 60 

GCCCGTTGTT GATAGCTTTC AATGCTGTTA CAAAATCTAG GCGCTCCAAC CTGTTGGCTC 120 

AATCGTTTAA AATCTTGATC TTTATGTTGA TAACCTTTAC CAGCAATATG CAAGTGATAA 180

TGACACAATT CGTGCAGTAT AATTTTTACA ACAGCATCTT CTCCATAATG CTCATATTGT 240

TTTGGATTAA TTTCAATATC ATGGGACTTT AAAAGATAAC GTCCGCCTGT TGTACGTAAC 300

CTTTTATTAA AATATGCACA ATGTCGAAAC GTACGTCCAA ATTTTTCTTC CGAAAGATTC 360

TCAACCATTC GCTGAAGTTT GTCATTATTC ATGTGGATCA ATCATCGTTA ATGATACTTT 420

GTCTTTATTT TTGTCAATAC TGTAAATCCA AACGTCAACG ATATCACCAA CACTGACAAT 480

ATCCATTGGA TTTTTTACGA ACTTCTTAGA AAGTTTCGAA ACATGGACAA GTCCATCTTG 540


CCAAGTGTTT CATTACTAGA
GATAAAAAGA AAGAACTGAA
GATGTAACTG AAGGTCCAAG
GTTTCAAGAA TTACGGCATT
CGTATAGAAG CGCCTATTCC
CCAACGACAG TCAACTTACG
AAATTAACAG TTGCGATGGG
AAAACGCCAC ACGCACTAAT
AGTATTTTGA TGTCTTTACT
GATCCAAAAA TGGTTGAATT
ATTACAGATG TCAAAGCAGC
CGTTATAAGT TATTTGCACA
CCCATATGAT GAAAGAATGn
GATGATGGTC CGCAAGAAGT
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TTTCATTCCT TCTTGTAAAf CTTCAATTGA TAGCACATCG GATTTAAGGA TTGGTGTTTC 660 
AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 720 

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1076 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA TACGCGACCA 60

ATAACTTCGC ATTGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GCGTAAATGG 120

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 240

AATTGCACAC TTGGTTTAAG GTGTACTTTA ACTTTTTGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 360

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 480

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 540

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 600

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 660

GACGGAACCG TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 720

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTTTCCATTA AATGACGTGC TACTTTAACA 840

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGATGCAAT 900

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTAGATTCAT ATACGCAGAT 960

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2930 base pairs 
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10 



(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATGAGT CAATAAAGCG 60 

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120 

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA GCGAGTATAG CGCCTCCAGG 180 

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA TTGCACCACC 240 

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 300 

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 360 

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 420 

GAGCATGTTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 480 

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540 

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 600 

TGCAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTGA TAGAGAAGTA ATACCAGAAC 660 

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 720 

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 780 

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGAcGTGACA TTCGAGGATT 840 

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 900 

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 960

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 1020 

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 1080 

CATGGGTTCG GTTCTCACAC ATACTCTATG TATAATGATT CTGGTGAACG TGTTTGGGTT 1140

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 1200

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 1260

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 1320

AAAGATAATC CATTTGATTT AACAAAAGTA TGGTATCACG ATGAGTATCC TCTAATTGAA 1380

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 1440

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 1560 
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GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 1680 

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 1740

GGCTATGAAT ATAATCAACG TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 1800

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTTTA CAAATACAGC AAATGCAATG 1860

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATTCGTC ATTGTTACAA AGCTGACCCA 1920

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 1980

ACTGAAAATG ATGAAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2040

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 2160

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 2220

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 2280

GATTTGAAAA TAAAGCGGGC TTACCTATTA TATTGGGGAG CTCGCTTTTT TATGAAATTT 2340

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 2400

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 2460

GTCACTTTGT TAGCGAATGC TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2520

GCGGAATTTA ATAAGTACGA AGTAGTTCTG GGTATGTTTT ATAAATGTTC GATAATACAC 2580

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 2640

GAATTATGTG AGGATTGTGT TATTATATAA ATCGTAATAA TTACGATTTG ATAAAAAGTG 2700

AGGTAACTAT ATATGGCTAA GAAATGTAAA ATAGCAAAAG AGAGAAAAAG AGAAGAGTTA 2760

GTAAATAAAT ATTACGAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2880

AGACCTAGAG GTGTATTACG TAAATTTGAA ATGTCTCGTA TTGCGTTTAG 2930

(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3606 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 60 
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TTATAAAAAA CTAATTTTAC AAATGCTTTT GCGTTCTTAC AAAAAATGCA TTTGACTATT 180 
ATTATAATAA GCGTATAATT GTCGCATATT ATTTTTTGTA TTTTTGGCAA TAACGAAGGA 240

GTATTTATGA ATAAAGACAA GCAATTGCAC AACGAGAAAA TCAATCTATC CCAATTAGTC 300

TTATTAGGGT TAGGCTCTTT AATAGGATCT GGTTGGCTAT TTGGTGCGTG GGAAGCATCA 360

TCAATAGCTG GACCAGCAGC AATCATATCA TGGGTTCTTG GATTCCTAGT CATTGGAACC 420

ATTGCCTATA ACTACATTGA AATCGGCACA ATGTTTCCTC AATCAGGTGG CATGAGTAAC 480

TATGCCCAGT ATACACATGG CTCATTATTA GGCTTTATTG CTGCTTGGGC GAATTGGGTG 540

TCTTTGGTGA CAATAATACC TATCGAAGCT GTGTCAGCTG TTCAATATAT GAGTTCTTGG 600

CCGTGGCATT GGGCGAAACC AATGAGATAT TTAATGGAAA ATGGCTCTAT TAGCACATAC 660

GGATTGCTAG CTGTATATCT CATCATTGTT ATTTTTTCAT TATTAAACTA TTGGTCCGTA 720

AAACTTTTAA CATCATTTAC GAGTTTAATT TCTGTATTTA AATTAGGCGT ACCCATGTTA 780

ACCATCATCA TGTTGATGCT ATCAGGATTC GACACTTCAA ATTACGGCCA TTCGGCAAGC 840

ACATTTATGC CTTACGGAAG TGCACCGATT TTTGCTGCAA CAACAGCATG AGGGATTATT 900

TTTTCATTCA ATTCATTCCA GACAATTATT AATATGGGTT CAGAAATTAA AAATCCTGAA 960

AAAAATATCG CAAGAGGCAT CGCTATCTCA CTGTCAATCA GTGCAGTGTT GTACATCATT 1020

TTACAAAGTA CGTTTATCAC TTCTATGCCT CAATCAATGT TACAACATAG TGGATGGAAT 1080

GGCATCAACT TCAATTCACC ATTTGCTGAT TTAGCTATCT TATTAGGAAT TAATTGGCTC 1140

GCAATTTTAC TATACATTGA AGCTTTTGTA TCACCATTCG GTACTGGCGT GTCATTTGTC 1200

GCCGTTAGAG GTCGAGTTTT ACGAGCAATG GAGAAAAATG GACATATCCC TAAATTTCTT 1260

GGGAAGATGA ATGAAAAATA TCATATCCCA CGTGTAGCAA TCATCTTTAA TGCCATCATT 1320

AGT£TGATTA TGGTTACATT ATTTAGAGAT TGGGGTACGC TAGCAGCAGT TATTTCTACT 1380

GCAACTTTAG TAGCCTATTT AAGTGGCCCA ACGACAGTGA TTGCATTAAG AAAAATGGGA 1440

CCAACAATGA CTCGTCCATT TAGAGCAAAA ATTTTAAAAG TAATGGCACC ATTATCATTT 1500

GTATTAGCTT CATTAGCTAT ATATTGGGCA ATGTGGCCAA CAACGGCTGA AGTTATTTTA 1560

ATCATTATAC TTGGATTACC AATCTACTTC TTCTATGAAT ATCGTATGAA TTGGCGTAAT 1620

ACAAAGAAAC AAATTGGTGG TAGCTTATGG ATTATTGTAT ATTTAATCGT GCTATCAATA 1680

CTGTCATTTA TAGGAAGCAA AGAATTTAAA GGCTTAAATA TGATTCACTA TCCATTTGAC 1740

TTTATCGTTA TTATTATTGT GGCACTTATC TTCTATTACA TCGGTACAAC GAGTTCATTT 1800

GAAAGCGTCT ATTTCCGTCG CGCAACACGA ATCAATACGA AGATGCGTGA GTCACTAAAT 1860
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CACACACATT 


AACCAACCAT TGATTTCAAC ATCTTGGTTG GTTTTTTATT TTGAAAATCG 1980

GTTATAAATA ACTAACATAA CAAGATGATG ATCAGGCTGG GACATAAATC AATGTTCTAT 2040

GCTCTACGAA gTTATATTGG CAGTAGTTGA CTGAACGAAA ATGCX3CTTGT AACAAGCTTT 2100

TTTCGATTCT AGTCAGGGGC CCCAACACAG AGAATTTCGA AAAGAAATTC TACAGGCAAT 2160

GCAAGTTGGO GTGGGACGAC GATAAAGAAA TACTTTTTCT ATAGAAATTA GTATytCTTA 2220

TGCATGAGTT TTACTCATGT ATTCATATTT TTAAGTACAC ATTAGCTGTG GCTAATGTAT 2280

AAGAACCACT ACATAATAAA TCATTTGTGG CTCTTTATCA TTTCTGTCCC ACTCCCGTAG 2340

AAGTACATCA TATAATGCTG AAAATGGTTT GAGTTAAAAC AGATATCAAG CTCGTCTGAT 2400

TCAGTCACAA AATTGTCTTG TTATACTTGT CACCTATCAT CTATAGACCG TGGTATGATT 2460

AAATTGGGGA TGATAAAGGA GGTTAATAAA TATGAAGATT AATACTACAG GTGGTCAAAT 2520

TCATGGTATT ACACAAGATG GTTTAGATAT CTTCTTAGGC ATTCCTTATG CAGAACCACC 2580

AGTTCATGAC AATCGCTTTA AACATTCTAC GTTAAAAACA CAATGGTCAG AGCCAATTGA 2640

TGCAACTGAA ATACAACCCA TCCCACCGCA ACCAGACAAC AAATTAGAAG ATTTTTTCTC 2700

CTCACAATCT ACAACTTTTA CTGAACATGA AGACTGTTTA TATCTAAATA TTTGGAAACA 2760

ACATAATGAT CAGACGAAGA AACCTGTCAT CATTTATTTT TATGGTGGTA GTTTTGAAAA 2820

TGGTCATGGT ACAGCCGAAC TCTATCAACG GGCACATTTA GTACAAAATA ACGACATTAT 2880

CGTTATTACA TGCAATTATC GTTTAGGCGC ATtAGGATAT TTAGACTGGT CATATTTTAA 2940

TAAAGATTTT CATTCCAATA ATGGCCTTTC AGATCAAATC AATGTCATAA AATGGGTGCA 3000

TCAATTTATT GAATCCTTCG GTGGCGACGC TAATAACATT ACTTTAATGG GTCAGTCTGC 3060

AGGGAGTATG AGCATTTTGA CTTTACTTAA AATACCTGAC ATTGAGCCAT ACTTCCATAA 3120

AGTQGTTCTA CTAAGTGGCG CACTACGATT AGACACCCTT GAGAGTGCAC GCAATAAAGC 3180

ACAACATTTC CAAAAAATGA TGCTCGATTA TTTAGATACA GATGATGTTA CATCATTATC 3240

GACAAATGAT ATTCTTATGC TGATGGCGAA gcTAAAACAA TCTCGAGGAC CTTGTAAAGG 3300

GCTTGATTTA ATATATGCGC CTATTAAAAC AGATTATATA CAAAATAATT ATCCAACAAC 3360

GAAACCAATT TTTGCATGTT ATACAAAAGA TGAAGGCGAT ATTTATATTA CTAGTGAACA 3420

GAAAAAATTA TCGCCGCAAC GCTTTATCGA CATTATGGAA TTAAATGATA TTCCTTTAAA 3480

ATACGAAGAT GTTCAGACGG CGAAGcAACA ATCTTTAGCG ATTACACATT GTTATTTCaA 3540

ACAGCCGATG aAGCAATTTT TACmACmACT CAATATACmA GATTGCAACC GCACCAACTA 3600

TGGCTT
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15109 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60 

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120 

TGGTTTGAAA ATGCAACsAG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180 

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATTCA GCTATATAAG 420

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480

TATAGCTAGA AAGTTAGATA TTTGTATTTT TTTAAATAAT AAGTGCCGTT GTTATCGTTC 540

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAGGTTTA AAAGGATTAC AACAGAATGT 720

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900

GTCSGTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960

ATTAATCGTT TTTCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATGG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTG AGCGACAAAG AAGCCTTTGA AGACATTATT 1140

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA CCTTTTTACC AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTGCA 1320

GAATGGAATG CATATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440
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GTACCACCAT 


TAACACACTT 


TATAACAAAT 


TCTACTTACT 


TTTTATGGGA 


TGTTTTAACG 


1560 




ACTGCTTATA 


TTGGTAACAA 


GGACTTGGTT 


CATTCAATTG 


AGAAAAAAGT 


CGATGTAATA 


1620 


5 


AGTTATGGAC 


CAAGTCAAGG 


TAAGACATTT 


GAGTGTAAAG 


ATGGGCGCAA 


AATTAATGTC 


1680 




ATAAATCATG 


TAGATAACAA 


CX3CATTTTTT 


GATTATATAA 


CTGCACTTGC 


TAAAAAAGTA 


1740 




AATTAACAGC 


TGTGTAGAAT 


AATTAAGGTT 


TTAATTTATA 


TAGAACAACT 


TATTGTAAAC 


1800 


70 


TTTTCATTTC 


TTAAAGTTTA 


CAATGGTGCT 


ATAATAATGG 


TCATGAAATA 


CGAAAGGAAG 


1860 




TAAAAAATGA 


CAACAAAACA 


GTTAGTATAT 


ACAGCTTTAA 


TGACAGCGAT 


TATCX3CTATT 


1920 




TTAGGATTGG 


TACCGGTAAT 


TCCACTACCA 


TTTTCTTCAG 


TACCAATTGT 


ACTTCAAAAC 


1980 


15 


ATTGGTATTT 


TCTTAGCAGG 


TGCGATTTTA 


GGACGTAAAT 


ATGGCACATT 


AAGTGTTATC 


2040 




GTCTTTTTAT 


TATTAGTAGT 


TGCTGGCTTG 


CCATTGTTAT 


CAGGTGGTCG 


CGGTGGCATC 


2100 




GGTGTATTCG 


CAGGTCCTTC 


AG CAG GGTTT 


TTACTATTAT 


ATCCAGTTGT 


AGCATTCATG 


2160 


20 


ATTGQGGCGA 


TTCGAGATAG 


ATTCATCAAT 


GAAATTAATT 


TCTGGATTTT 


ATTCGTTGGT 


2220 




ATTTTAGTTT 


TTGGTGTTAT 


AGCATTAGAT 


GTTATTGGTA 


CATTGATTAT 


GGGCATGATT 


2280 


25 


ATTAACATAC 


CATTTACGAA 


AG CTATTTCA 


ATTTCATTAG 


CTTATTTGCC 


TGGTGATATA 


2340 


TTAAAAGCAA 


TTGTAG CAAG TTTGATTGGT 


ACAGCTTTAC 


TTAATCACTC 


GCAGTTTCGT 


2400 




CAAATTATGG 


GAATAAAATA 


ATCATATTTA 


AGATAGTAAA 


GTAATTGAAT 


AAGTTGCTTT 


2460 


30 


GAAATTTATA 


AAAGTGAAAG 


GAGTAGGTGT 


CAATGGCTAG 


TATAAGTATG 


TCAGATATAT 


2520 


ATTGTAACGG 


CACTATATTT 


GAAAATGACG 


ACGAGCAGTT 


GATTTATTTA 


ACGCCTTCTT 


2580 




TTCCACAACG 


ATACACAAGT 


AACACATGGA 


TATATAAAAA 


GACGCCTACC 


CAAGAGCGAT 


2640 


35 


GGCTGAAAGA 


CTTAGAACGT 


CAACATCAAT 


TACATACAAA 


TCAAGGTTCA 


AATCATTATG 


2700 




CGTTTAGTTT 


CCCGGAAAAT 


GAACAACTTG 


ATAATCATTG 


GATGGCTATG 


TTTAAAGATA 


2760 




TGAATTTTGA 


ACTAGGTATT 


ATGGAATTGT 


ATGCCATAGA 


AAGTGATGCG 


CTTGCCAATT 


2820 


40 


TGCCGCGTAA 


CTCTGACGTT 


GAAATTGCCA 


TCGTTGACGA 


GTCGCATATA 


GATGCCTATT 


2880 




TAAAAGTTGC 


ATATCAGTTT 


AGTTTGCCAT 


TTGGAAAAGA 


CTATGCAGAT 


GCACATGAAG 


2940 






GGAACATTAT 


GAAAAAGATG 


JAiAl 


CTTAGTAGCT 


TATTTAAATA 


3000 


ATGAACCTAT TGGCGTTGTA GATGTCATTG AAAGTGAAAA TTACATTGAA TTAGATGGAT 3060

TTGGTGTATT AGAACAATTT GGGCACCAAG GAATTGGATC TACAATTCAA TCGTTGATAG 3120

GTGAATACGC CATATCAAAA AATCACAAAC CAATCATATT AGTTGCAGAT GGTGAAGATA 3180

CAGGAAAAGA TATGTATGCA AAGCAAGGTT ATGTCTATCA ATCGTTTTGT TATCAAATAT 3240
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TAAGCTGGTT 


TCGAGTAGAA ATCAACTTAC 


TRfTTTTTAA 

X VJ \w X X X X X 


X X VJ X X X X Wl 


GCTACTTATA 


3360 




CTTATAAAAA 


TAGTGCGTTT 


AAATTGTTGA 


TTCATGTAGA 


IV T A T PWFTP A 


TTATGACACA 

X X J* X wAwn\«n 




5 


CTATAATGAA 


TATGTTATTG 


TTCAGAATCA 


ATGATACGTT 


PTY5GATGAPT 

W X X Un\> X 


GTAT ATATTTA 


34 80 




AAGCCACCAT 


TTCGAATAAA 


TCCAACTGCC 


f!TAATATTTA 

V A/Wlin* A X^^ 


GGT CATT AGC 


TAAGGTTACA 




10 


GCAAGCGTTG 


TCGGAGCTGA 


TTTAGATAAA 


ATGACGCCAA 


C APPA ATTTT* 


< TY?pflfip*iT , i , A 

X\7V»WW X X X A 


3 O WW 


ATTAAAATTT 


CTGATGAAAT 


ACGTCCACTA 


AAAATTAATA 


P* I'l'M'AT'PTPfZ 

\a« XXX X X* X 


f3 A P A(TT A AT A 
\3nU/vj X X^\ 






TGTCGCTGAA 


TACAAAATCC 


ATATAATTTA 


TCTAGAGCGT 


TATftTfTA pp 


A ATfiT CTTCZT 




15 


CGATGTACAA AAAATGTCAA 


ACCATCGCTT 


ATAnp Af?pAT 

.rt X AVJUlULll X 




T* T " I 'f J* T 1 ^ I ^ 

M.V_V». iol X 1L1 


J /ou 


i 


TGGTAAATAT 


GACTTGCACT 


TTGTAATCGA 


uluAl L_nxl*?l 


1AAIAA1 1 i\jr 


CATTGGAGTT 


1 0 A rt 

3840 




AAAGTGATTT 


TAGACATAGA 


TGTTTTAGCG 


A 1 AtsCAlj\JAl 


CATTTTGAAA 


Al AAAAC x CA 


^ a ft ft 


CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3960

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4020

AATTCATCTC GCTTTAAAAT GGCACCTTCC CxAAGCCAGAA ATCCAATGAC TAACTCCTCA 4080

AGGTTTGTTG GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 4140

GGAAATTCTG TCACATATTG ATCTGTTGTA 11 vjAATAAIT TTCCATCTTC ATATCTAACA 4200

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4260


30 


TTATTTTAAA 


GCAATTTGCT 


TAAAATTTTA 




TTAAGTTTGA 


AATTTGATTG 


4320 




ATAAAAATTA 


ATAGCGAGCA 


ATCTGTTTGA 


TTTAAATTGA 


All CtjAtsAAx 


A 1 ALIATACTA 


4380 


35 


GGG CATCAAT 


TAATAAATAT 


CAATCTTATG 


pa a ATTTTSAr 1 

V^vvrl XXX UAL 


AnX lol X lun. 


A T/**»A ATA ""PAT* 


444 0 


AAACAGGCAA 


CGGTTCTTTT 


CAAATATAAT 


AfiTAAfyTfiTA 




blAAAlAl 1 A 


A C ft ft 




TTAftAAATGG 


GGGTTCACTC 


AATGAAATTG 


AAAPRTTTAT 


X X V? X vj X x 0 X 


HATTfirAATfi 
VZtt X X UUnn X \3 - 


*% 3D W 


40 


CTTTTAGTAT 


TAGCTGGTTG 


CTCTAATTCT 


AAPRATAATA 


ATRAAARTAA 


AAAAfSAnrVZAP 


ii ^ "7 n 




GCAGACAATG 


GTAAGAAACA 


AGAGATTCAA 


GTTG cagcgg 


P Af^P A AfVTTT 


AAPAGATYITA 


4. £ a n 
** 0 0 u 




ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA gagcataaaa ATGCTGATAT TAAATTTAAC 4740

TATGGTGGAT CAGGGGCATT AAGAAAACAA attgaatcag GCGCACCTGT TGACGTATTT 4800

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC gcatgataga 4860

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4920

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4980

GGAAAATATG CGAAACAGTA TTTAGATAAC AATAACTTAT TTAAAGAAGT cgaaagtaaa 5040

CAAGGTTTTG TGTATAAAAC TGACTTATAT AAACAAAATA AAAAAATTGA TACTGTAAAA 5160

GTAATTAAAG AAGTAGAACT TAAGAAGCCA ATCACATACG AAGCTGGTGC TACATGAGAT 5220 

AGTAAATTAG CAAAAGAGTG GATGGAATTC TTAAAATCAG ATAAAGCTAA AGAAATACTA 5280 

AAAGAATACC ACTTTGCAGC ATAAGGAGTT GTAATCCATG CCTGACTTAA CACCTTTTTG 5340 

GATATCAATA CGAGTTGCTG TAATCAGTAC GATTATTGTA ACGGTTTTAG GTATTTTTAT 5400 

ATCTAAATGG TTGTATCGTC GTAAGGGTTC GTGGGTTAAA GTATTGGAAA GTTTATTGAT 5460 

ATTACCTATT GTTTTGCCGC CAACGGTATT AGGTTTTATT CTATTAATCA TCTTCTCGCC 5520 

AAGAGGACCA ATCGGTCAAT TCTTTGGGAA TGTACTACAT TTACCTGTAG TGTTCACTTT 5580 

GACAGGTGCT GTGATAGCAT CTGTCATTGT TAGTTTTCCA CTAATGTATC AACATACTGT 5640 

GCAAGGCTTC AGAGGTATAG ACACGAAAAT GATTAATACA GCTAGAACGA TGGGAGCAAG 5700 

TGAAACGAAA ATTTTCCTCA AATTAATTTT ACCATTAGCT AAACGCTCTA TTTTAGCAGG 5760 

TATAATGATG AGTTTTGCTC GTGCATTAGG TGAGTTTGGT GCTACATTAA TGGTTGCAGG 5820 

ATATATTCCA AATAAAACGA ATACACTACC TTTAGAAATA TACTTCTTAG TGGAACAAGG 5880 

TAGAGAAAAT GAAGCGTGGT TATGGGTATT AGTGCTAGTC GCATTCTCTA TTGTGGTTAT 5940 

ATCTACAATT AATTTATTGA ATAAAGATAA ATATAAGGAG GTCGACTAGA TGCTTAAAAT 6000 

CAATGTGAAA TATCAATTAA AGAACACTTT AATTCGCATC AATATAGATG ATACTGAACC 6060 

AAAAATTTAT GCAGTTCGTG GTCCATGTGG CATTGGTAAA ACTACTGTTT TAAATATGAT 6120 

TGCCGGATTA CGTAAAGCAG ATGAAGCTAT TATCGAAGTG AATGGGCAAT TACTTACTGA 6180 

TACGGCAAAA AACGTGAATG TTAAAATTGA ACAAGGACGT ATTGGATATC TGTTTCAAGA 6240 

CTACCAATTG TTTCCTAATA TGACGGTCTA TAAAAATATT ACTTTTATGG CTGAACCATC 6300

TGAAeACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 6360 

TATGACATTG TCAGGTGGAG AGGCACAACG TGTAGCACTT GCACGTGCAG TTAGCAC r AA 6420 

ACCAGATTTA ATTTTATTAG ATGAACCTTT TTCTAGTTTG GATGATACTA CAAAAGATGA 6480

GAGTATTACA TTAGTTAAAC GTATTTTCAA CGAATGGCAA ATACCAATCA TATTTGTGAC 6540

ACATTCAAAC TATGAAGCAG AACAAATGGC TCATGAAATT ATTACAATTG GGTAATCATT 6600 

TATTTGCCAT TAAAGAGTTT AGAACGTATT TAAAATTGTA GAAGTGAATG CTTCTATCAG 6660 

CATTTTAATG ATGTTTTAAA CTCTTTTTTA GGGGCAGTTT TTTTGAGAGA CATTGACGCG 6720 

CGTCATATAA TGAAAGTAAT GATAAAAAGA AAGGATAACT TAATGTGAGT CAAGAACGTT 6780 

ATTCAAGGCA AATTTTATTT AAACAAATAG GTGAAATAGG TCAAAGCAAA ATAAATCAAA 6840

GAGCAGGCAT TGCCAAACTA ATCATTGTTG ATAGAGATTA TATTGAATTT AGTAATTTAC 6960

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTGCAG 7020

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140

ATAACTTTGA AACACGACAA CTGATTAATG ATTTTGCATA TAAATATCGT ATACCTTGGA 7200

TTTATGGTGG TGTTGTACAG AGTACATATA CAGAAGCTGC ATTTATACCT GGTAAAACAC 7260

CTTGCTTTAA CTGTTTGGTA CCACAATTGC CAGCATTAAA TTTAACATGT GATACAGTAG 7320

GGGTCATTCA ACCTGCCGTG ACGATGGCAA CAAGTTTACA ATTAAGAGAT GCGATGAAAG 7380

TATTAACGGA ACAACCAATT GACACAAAAA TAACTTATGG CGATATTTGG GAAGGTAGTC 7440

ATTATTCATT TGGTTTGAGT AAAATGCAAC GTTCAGACTG TACAACTTGT GGAGATGTAC 7500

CAAGTTATCC GTATTTAAAC AAGAATGAAC AACGTTATGC AACATTGTGT GGTAGAGACA 7560

CTGTACAGTA TGAAAATGCA TCAATTACAC ACGACATTCT TGTTCAATTT TTAAAACAAC 7620

ATCAGTTAAA TTATCGCAGT AATTCGTATA TGGTTATGTT TGAATTTAAA GGACACCGCA 7680

TTGTTGCTTT TAAAGGTGGA AGGTTTTTAA TACATGGCAT GACACGCACA TCAGATGCCA 7740

CACATCTAAT GAATTTATTG TTTGGATAAA AAAAGATAAG ACAAAAGGAG TGTAATATTA 7800

TGGGCGAACA TCAAAACGTT AAATTGAATC GTACAGTTAA AGCAGCCGTA CTAACGGTAT 7860

CAGATACTAG AGACTTTGAT ACAGATAAAG GTGGTCAATG CGTGCGCCAA CTATTACAAG 7920

CAGATGACGT TGAAGTGAGT GACGCACATT ATACAATTGT GAAAGATGAA AAAGTAGCCA 7980

TCACGACGCA GGTGAAGAAG TGGTTAGAAG AAGATATTGA TGTCATCATT ACGACTGGTG 8040

GAACAGGTAT TGCACAACGT GATGTGACGA TTGAAGCAGT AAAACCACTT TTAACTAAAG 8100

aga*£Agaagg CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGATTC 8220

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8280

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8340

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 8400

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATTTC 8460

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 8520

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 8580

CGGAATAAGT GTTGATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 8640

AATGCTTGAA TGAGCGACAG CAGTTCTTTT TGTAATTTGT TTGTCTGATA CATCGACCAT 8760

TTTGGCGTGG CCTTGTTGAT TAATATGAGT AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820

TCTAGTATAT CATGAAAAAA TAAAAGTTTT GGAGATGATT TTTAATGGTA GTAGAAAAAA 8880

GAAACCCAAT CCGAGTTAAA GAAGCAATTC AACGTATCGT TAATCAGCAG AGTTCAATGC 8940

CGGCAATTAC GGTAGCACTT GAAAAAAGTC TAAATCATAT CTTAGCAGAA GATATTGTAG 9000

CTACTTATGA TATACCAAGG TTTGATAAAT CACCTTATGA TGGTTTTGCA ATTCGCAGTG 9060

TTGATTCACA AGGGGCAAGT GGTCAGAATC GCATTGAGTT TAAAGTGATT GATCATATTG 9120

GTGCAGGTTC AGTTTCTGAT AAATTAGTTG GGGATCACGA AGCGGTGCGT ATTATGACTG 9180


GAGCACAAAT 


ACCTAATGGC 


GCAGATGCTG 


TTGTTATGTT 


TGAACAAACG 


ATTGAA (TAfJ 
x x unn^. x n\i 


924 0 


AAGATACATT 


TACAATTCGT 


AAACCATTTT 


CAAAAAATGA AAATATATCT 


X X AfUlftww X w 


7 J V U 


AAGAAACAAA 


GACAGGCGAT 


GTTGTTCTAA 


AAAAAGGACA 


AGTAATTAAT 


Va-nwwww^ X f\ 




TCGCGGTCCT 


TGCAACATAT 


GGCTATGCAG 


AGGTTAAAGT 


TATTAAGCAA 






CTGTTATTGC 


AACAGGAAGC 


GAATTATTAG 


ATGTTAATGA 


TGTATTAGAA 






TTCGTAACTC 


TAATGGCCCA 


ATGATTCGTG 


CCTTAGCAGA 


AAAATTAGGT 


Wx Xl2/vVwX X\9 




GTATTTACAA 


AACACAAAAA 


G ATG ATTT AG 

v/l X VJXl XXX 


illnv X wV7^»rt 1 


LUinu X X 1 


A A Af2 & AG/TA 


3DUU 


TGGAAAAACA 


TGATATCGTT 


ATTACAACGG 


GCGGAGTTTC 


TGTTGGAGAT 


XXX waV* X a X X 




TACCTGAGAT 


TTATAAGGCT 


GTAAAGG CGG 


AAGTGTTATT 


TAATAAAGTA 


VJV«rt X w^w X 


Q7 9 n 

7 / A u 


CTGGTAGCGT 


AACAACGGTT 


G CATTTGT AG 


ATGGaAAGTA 


TTTGTTTGGa 




7 / OU 


ATCGATCAGC TTGTTTTACA GGATTTGAAC TATTTGTGAA nCCAGCTGTT AAACATATGT 9840

GTGGCGCACT AGAAGTCTTC CCGCAAATAA TTAAAGCAAG ATTAATGGAA GATTTTACCA 9900

AGGGAAACCC ATTCACACGA TTTATACGTG CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960

CTGTAGTACC TTCAGGATTC AATAAATCAG GTGCGGTTGT AGCGATTGCA CATGCTAACT 10020

GTATGGTCAT GTTACCAGGA GGGTCACGTG GTTTTAAAGC GGGGCATACA GTAGATATTA 10080

TATTGACTGA ATCTGACGCT GCTGAAGAGG AACTTCTTTT ATGATTTTAC AAATTGTAGG 10140

TTACAAAAAG TCTGGTAAGA CAACATTGAT GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200

TGGTTATACA GTTGCTACTA TTAAACATCA TGGGCATGGT AAGGAAGATA TTCAATTACA 10260

GGATTCAGAC GTCGATCACA TGAAGCATTT TGAAGCGGGG GCAGATCAAA GTATTGTACA 10320

AGGTTTTCAA TATCAGCAAA CTGTAACACG TGTAGATAAT CAAAATCTTA CTCAAATTAT 10380

TGAAAAATCT GTTACAATTG ACACCAATAT CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440

GAATGTTTGT TATAGCATTA ATGTAAGGGA GCATGAAGAT TTTACAGCAT TTGAGCAATG 10560

GTTATTAAAT AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 10620

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 10680

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGGTCA TGTTCGCGAA TGGACTAAAG 10740

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10860

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 10980

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040

ATGAAGAAGC AAAGAGGGAG GAATAAGAGA GATGAAGGTA CTTTACTTCG CAGAAATTAA 11100

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 11160

ATTTGAAGAT TTATTGTTTG AACGTTATCC GCAAATCAAT AATAAAAAGT TTCAAGTTGC 11220

TGTAAATGAG GAATTTGTAC AAAAATCGGA TTTCATTCAA CCTAATGATA CTGTTGCATT 11280

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 11340

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGTGAGA CCTTTTATAG 11400

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 11460

TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 11640

TCAGTTTTTA GTTTCTCATC TTATTGAAAA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700

TGGACGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTACG GTTATCTGTG ACAGATCGGT 12000

GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12060

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 12120

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180

ATGTACTTAT AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGGT TTGACTACAA 12240

ATGTCAGTTT GGATGCTATT GATGATACGC TATTTCAATC AATCAATAAT GGTAATATTA 12360

AAGCGACTAC GATTTTAGAA CAAATTGATT ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420

TAAATGTTGT TATACAAAAA GGTATTAACG ATGATCAAAT CATACCAATG CTTGAATATT 12480

TTAAAGATAA ACATATAGAG ATTCGATTTA TAGAATTTAT GGATGTTGGT AATGATAATG 12540

GATGGGATTT CAGTAAAGTT GTAACTAAAG ATGAAATGCT TACAATGATA GAGCAGCACT 12600

TTGAAATCGA TCCTGTAGAA CCAAAATATT TTGGGGAAGT AGCAAAATAT TATCGCCATA 12660

AGGATAATGG TGTTCAATTT GGTTTGATTA CAAGTGTTTC ACAATCATTT TGTTCTACAT 12720

GTACACGCGC AAGGCTGTCA TCAGATGGGA AGTTTTACGG ATGTTTATTT GCAACTGTCG 12780

ATGGATTTAA CGTTAAAGCG TTTATTCGTT CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 12840

AATTTAAAGC TTTATGGCAA ATAAGAGATG ATCGATATTC AGATGAGAGA ACTGCTCAAA 12900

CAGTTGCCAA TCGTCAACGT AAAAAGATAA ACATGAATTA TATTGGTGGT TAATGTGTAG 12960

GGACCACTAC ATATTAAATC ATTAGAGATG TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020

ACAATATTAT TTATTAAAGT AAAAACGGTC ATATCTATGC CAGATTTAAT AGAAATGATC 13080

GTTTTTAAAG TTTTTACAAG TTGGCGGGGC CCCAACACAG AAGCTGACAG AAAGTCAGCT 13140

TACAATAATG TGCAAGTTGG CGGGGCCCCA ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200

GACAATGCAA GTTGGGGAAC GGGGCCCCAA CACAGAAGGT GACGAAAAGT CAGCATACAA 13260

TAATGTGGAA GTTGGCGGGG CCCCAACATA GAGAATTTCA AAAGAAATTC TACAGACAAT 13320

GCAAGTTGGG GATCAACGAA ATAAATTTTA TGAGAATATC ATTTCTATCC CACTCTTAAG 13380

AATCACTACA TAATAAATCT TTAGTGGTTC TTTAACATTG ATGTCACACT CCATGCCATT 13440

GAGTTGTAAT ATATCTTTTT TAGGTATAAA TGTTGTCGAA TAAAGAACAA GTTGTCCAAA 13500

AGATATAAAT CTAAACAAGA TATAGCCAGC AATTTAATAT TTGTAATAGA TAAAATGCTA 13560

AGTTTGATAT ATAATAAATT TAAGTAATTG TATAATAATA TGAATTACAA ACATCTAAGA 13620

AGAAACATAG GAGGCATCAT ATTATGAGTA ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 13680

GGGAGTTAAG TCAGTTAAAG CACTGGTTAA AAACAACACA TAAGATTTCA ATTGAAGAAT 13740

TTGTAGTCCT TTTTAAAGTG TATGAAGCTG AAAAGATTAG CGGTAAAGAA TTGAGGGATm 13800

CATTACATTT TGAAATGCTA TGGGATACAA GTAAAATCGA TGTGATTATC CGTAAAaTCT 13860

ATAAAAAAGA GCTTATTTCT AAATTGCGTT CTGAAACGGA TGAAAGACAA GTATTCTATT 13920

TCTATAGTAC TTCTCAAAAG AAATTGTTAG ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980

GCGTTACAAA CTAAAAACTT aAAAAgcaTG CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040

GTTCATGGCA TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 14160

GATTGCTAAA GCGGCCATAA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 14220 

CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CGAACGTAAA AAGTAGATGT 14280 

CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 14340 

TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATATTAACC AGCTTTGATA 14400 

GCTTGAAATT AAGCATACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT 14460 

ATTTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT 14520 

AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC 14580 

TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC 14640 

AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT 14700 

AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG 14760 

TCCGAGTATG CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGCGT ATTTTGTAAT 14820

AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA 14880

ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC 14940

AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA 15000 

AGATCCATCA ATAAAATAAA GTAATTGCGT GATAATTAAA GCAATTAAAC CAATAAATAA 15060 

TAATCGTTTA GGTCCrATTT SATTTACAAA TTTACCTGTA GCAAATCGA 15109

(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9072 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 60 

TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 120 

CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 180

GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 240

TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 300

GCAATTAAAG CTGAAAAAGC TGGTAAAAAA GTTGAAATGA AGAAAACAGA GGAATATGTT 360

AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgT 480

GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540 

CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600 

AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGCAACAC 660 

TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720 

GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 780 

CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840

TAAtATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACAA CCAATTGAGC GATAATAAAT 900

TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960

TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT ACGAATGTGc AAACAAAGTA 1020

ATCGGTAGAA ATTCAACATA CATAGCGCCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080

AAAATATAAA ATTCTACATA ATCAAGACCA TGATGTGTAG TTGTTTAACT TATGACTCTA 1140

TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200

TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTATTCTGGT 1260

AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 1320

ATTGGAACCA GTGGAAGTGG CAAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 1380

GCGAGAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 1440

TTGCGTAGAA ATATTGGCTA TGTTATTGAA CAAATTGGCT TAATGCCTGA TATGACGATT 1500

AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 1560

CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 1620

GCACHVACTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 1680

CAAGATATTA TTTTAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 1740

TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 1800

CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTATGTC AGAAGGTAAG 1860

GTGGTGCAAT TTGATACGCC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920

GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 1980 

GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2040 

ATGAGACAAA AACGTGTTGA TACTATTTTT GTAGTAGATA GTAATAACCA TTTACTAGGT 2100 

TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAAGTTT ACGAGACACC 2160

ATTTTAAAAA GAAACGTTAG GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 2280 

CTGATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2340

GATACAGTGC AAACAGAACA TGTGGGGGAA GACAcTGCGT CCTCAAAAGT GCATGAGCAA 2400

CACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 2460

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2520

GATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 2640

TACTTGCTAT TATGATACCG ATTTTTGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700

TTATTTATGT ATTATTACCT ATTTTAAATA ACACGGTACT CGGTGTTCAA AATATTGATA 2760

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820

TTGAATTGCC GTTAGCATTG CCGCTTATCA TTGGTGGCAT TCGTTTGTCA TCTGTGTATG 2880

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT GATTTCATTT 2940

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3000

CACTAGCATT AGGTGTTGAT GCCTTATTAG CTTTAGTTGA AAAATGGGTA GTTCCCAAAG 3060

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 3120

GTCGTGTTTG TCTTATCGCT TACCGTATTA TCTGGATGTA GTTTGCCCX3G ACTAGGTAGT 3180

AAGAGCACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240

TCACATATGT TACGGTTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 3300

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3360

ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 3480 

ACGTTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540 

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AGCATAGTAA AGATTTACGT 3600 

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCGATG GCTATGAAGG ATTTAAAAAA 3660 

GAGTATGGTT TTGACTTTGG TACAGTGAGA CCAATCCAAA TAGGTCTAGT CTACGACGCA 3720 

TTAAACTCAG AGAAGTTAGA CGTTGCATTA GGTTATTCTA CAGATGGTCG AATTGCGGCG 3780

TATGATTTGA AAGTACTTAA AGATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3840

GTTGCAACAA ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3900

ACAGGAAAGA TTTCGACTTC AGAGATGCAA CGCTTGAATT ATGAAGCGGA TGGTAAAGGT 3960

AAAGGTGGTC ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4080

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 4140

TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 4200

TTTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 4260

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTTTTA ACAGTGTTTT 4320

TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4380

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 4440

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCGA 4500

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4560

GTGGTACAAA TGCGACGGAT GGCACAACGT TTATTTTAGC AGGTGCGATT CCGATTGCTA 4620

TCATTGCAAT CGTCATTGAT GTACTATTAA GATTTTTAGA AAAACGATTA GACCCAACAA 4680

CACGACATOG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 4740

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAGGC TACTCAAGGA 4800

GCTCAAATAA TCTTTGAGTA GCCTTTTTAT AGGTTGTGTT TGTATGCGTT TACACTAAAA 4860

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGCGTT AATTATTGTA AAAATACTAA 4920

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATATC 4980

ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5040

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160

CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 5220

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 5280

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 5340

AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TTGCTATTAT 5400

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 5460

TTGGGGTGGT AGTGGTATTT GTACGTTGTT TGGTGGCTTA ATGGGTACAT ATATAGGTTG 5520

GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580

TGCACCTGAG ACTAAAGCAG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 5640

CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 5700

GACGTCTCAT TTTGGTTTAG TTTCACCGTT AATTCTAGGT TTAATTGTTG TGTTTATCTG 5760

AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 5880

AGCAGGTGGT GCACTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 5940 

TTGGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 6060

TACAGTGATT GGGTTAATCT TATTGTCGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 6120

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 6180

AACTGATAGA GCAGTTGCTA GTGCGCCAGA TGATAAGTCG GGTGTTGCTT CAGGTGTGTA 6240

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CGGTTTATAC 6300

TGTGTTAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA TGTTTAATGC 6360

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA GTTCCTAAAA ATCAAACGAA 6420

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 6480

AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATGTTAG GTGTGTTCTT 6540

TTATAGACGA TAAAAGCTGT GTGCATATTA AGOGAATGAT TTTCAAATTG ACGCTAATAT 6600

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTCAAGCAGG 6720

ATATGATGTC ACACTTATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AGCAACATGG 6780

ATTAAATATA ACGATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6840

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6900

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6960

GATGAATCGT CTGAAGCATG AAGAAGTCAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7020

CAGAGGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7080

TAGTGGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 7140

TGCTGATTTA CTTAACGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 7200

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 7260

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCGAAGTGT TTGATTTATA AATTAACGCA 7320

AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 7380

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 7440

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7500

TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7560

CAATCACGTG ATATTACGGT CATTATTAAG ATTGAAATGT AATAAATAAA GAACAGCAGT 7680

AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 7740

TTCAGATTTG TCATAGGcTA CGACATAGTA TTAGTATTTA CTAGACAGTT TTTACGACGA 7800

CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 7860

CAAGCAAGTA ATTATATTAA AAAGACAAAT AGAGAAAAGG TGTTTATAAT GAGTAAAATT 7920

TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980

GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 8040

AATGtAAAAG CATATATTGG TGATATATTA AAAGCTGATA GTATTGATCA AGCGTTAGCA 8100

GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160

GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCGGC GAAAAAGCAT 8220

GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACGTGG CGAAGGATTA 8280

GCAAATGAGG AAACTTCACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 8340

GGTGTGGTTG GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 8400

GGGTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GGATGATTTA TAATCAATTT 8460

ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8520

GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580

GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 8640

AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8700

GCGCAAGGTG GTACTCTGAT TTATCAAAGt TGGAAAGATG GCATGAATCC AATTAAATAA 8760

TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8820

TCAAACGCAC GGCTCGAACA AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 8880

ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 8940

TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 9000

ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 9060

ATACAGCAAT CC 9072

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16826 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

GTGGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 60

TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TGTAAATGAA AGTGACAATA 120

TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 180

TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 240

TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 300

TGATCCAAAC TTTAAATACA AAACACGCAT CAATAACATT ACAAGCGAAT TATATGACGC 360

TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 420

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATTTTA AATATTATAG 480

GTTTAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 540

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCGACGTTA CCAACTCAAT 600

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGTGTGGA TGGCCATATG 660

AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC 720

GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA 780

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATGAT CAGATTCTTG 840

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900

AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 960

CTCGTCCAGG TAGCATGATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080

TTTCTAAAAA TGCTGACAAT ATCCATGTGA TTCCTAACTG GTATGACATG CGTCAATTAC 1140

AAG^CAATCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 1200

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 1260

AATTAAATAA GGATCAGTCT CAAACGTTAA CAATACTTTG TGGTCATGGT AAGAAATTTG 1320

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATG TTTGAGTTTT 1380

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 1440

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 1620

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 1680

AAGCGATTAT TCGATGTAGT GAGTTCAATA TATGGTTTAG TAGTTTTAAG TCCGATTCTG 1800

TTAATTACAG CATTACTAAT TAAAATGGAa TCACCTGGAC CAGCCATTTT CAAACAAAAA 1860

AGACCGACGA TTAATAATGA ATTGTTTAAT ATTTATAAGT TTAGATCAAT GAAAATAGAC 1920

ACACCTAATG TTGCAACTGA TTTAATGGAT TCAACATCGT ATATAACAAA GACAGGGAAG 1980

GTCATTCGTA AGACCTCTAT TGATGAATTG CCACAATTAT TGAATGTTTT AAAAGGAGAA 2040

ATGTCAATTG TAGGTCCTAG ACCAGCGCTT TATAATCAAT ACGAATTAAT CGAAAAACGT 2100

ACAAAAGCGA ACGTGCATAC GATTAGACCA GGTGTGACAG GACTAGCTCA AGTGATGGGG 2160

AGAGATGATA TCACTGATGA TCAAAAAGTA GCGTATGATC ATTATTACTT AACACATCAA 2220

TCTATGATGC TTGATATGTA TATCATATAT AAAACAATTA AAAATATGGT TACTTCAGAA 2280

GGTGTGCATC ACTAATGAGA AAAAATATTT TAATTACAGG CGTACATGGA TATATCGGTA 2340

ATGCTTTAAA AGATAAGCTT ATTGAACAAG GACATCAAGT AGATCAAATT AATGTTAGGA 2400

ATCAATTATG GAAGTCGACC TCGTTCAAAG ATTATGATGT TTTAATTCAT ACAGCAGCTT 2460

TGGTTCACAA CAATTCACCT CAAGCAAGGC TATCTGATTA TATGCAAGTG AATATGTTGC 2520

TGACGAAACA ATTGGCACAA AAGGCTAAAG GTGAAGACGT TAAACAATTT ATTTTTATGA 2580

GTACTATGGC AGTTTATGGA AAAGAAGGTC ATGTTGGTAA ATCAGATCAA GTTGATACAC 2640

AAACACCAAT GAACCCTACG ACCAACTATG GTATTTCCAA AAAGTTCGCT GAACAAGCAT 2700

TACAAGAATT GATTAGTGAT TCGTTTAAAG TAGCAATTGT GAGACCACCA ATGATTTATG 2760

GTGCACATTG CCCAGGAAAT TTCCAACGGT TAATGCAATT GTCAAAGCGA TTGCCAATCA 2820

TTCCCAATAT TAACAATCAG CGCAGTGCAT TATATATTAA ACATCTGACA GCATTTATTG 2880

ATCAATTAAT ATCATTAGAA GTGACAGGTG TGTACCATCC TCAAGATAGT TTTTACTTTG 2940

ATAC&TCGTC AGTAATGTAT GAAATACGTC GCCAATCACA TCGTAAAAGG GTATTGATCA 3000

ACATGCCTTC AATGCTAAAT AAGTATTTTA ATAAGTTGTG GGTCTTTAGA AAATTATTCG 3060

GCAATTTAAT ATACAGCAAT ACGTTATATG AAAATAATAA TGCACTTGAA ATTATTGCTG 3120

GAAAAATGTC ACTTGTTATT GCGGACATCA TGGATGAAAC GACAACCAAA GATAAGGCAT 3180


45 




TTAAATAAAA 


TCAACATACA 


AATCGTTTTA 


nTGGAGGTT 


ATAGTATGAA 


3240 


GTTAACAGTA 


GTTGGCTTAG 


GTTATATTGG 


TTTACCAACA 


TCAATTATGT 


TTGCAAAACA 


3300 




TGGcGTCGAT 


GTGCTTGGTG 


TTGATATTAA 


TCAGCAAACG 


ATTGATAAGT 


TACAAAGTGG 


3360 


SO 


TCAAATTAGT 


ATTGAAGAAC 


CTGGATTACA 


AGAGGTTTAT 


GAAGAGGTAC 


TGTCATCGGG 


3420 




AAAATTGAAG 


GTATCTACAA 


CGCCAGATGC 


AT CTGATGTT 


TTTATCATTG 


CCGTTCCGAC 


3480 




TAGTATTTTA TCATTTTTAG AAAAAGGAAA TACCATTATT GTAGAGTCGA CAATTGCGCC 3600

TAAAACGATG GATGATTTTG TAAAACCAGT CATTGAAAAT TTAGGGTTTA CAATAGGTGA 3660 

AGATATTTAT TTAGTGCATT GTCCAGAACG TGTACTGCCA GGAAAAATTT TAGAAGAATT 3720

AGTTCATAAC AATCGTATCA TTGGCGGTGT GACTGAAGCT TGTATTGAAG CGGGTAAACG 3780 

TGTCTATCGC ACATTCGTTC AGGGAGAAAT GATTGAAACA GATGCACGTA CTGCTGAAAT 3840 

GAGTAAGCTA ATGGAAAACA CATATAGAGA CGTGAACATT GCTTTAGCTA ATGAATTAAC 3900 

AAAAATTTGC AATAACTTAA ATATTAATGT ATTAGATGTG ATTGAAATGG CAAACAAACA 3960 

TCCGCGTGTT AACATCCATC AGCCTGGTCC AGGTGTAGGC GGTCATTGTT TAGCTGTTGA 4020 

TCCGTACTTT ATTATTGCTA AAGACCCTGA AAATGCAAAG TTAATTCAAA CTGGACGTGA 4080 

AATTAATAAT TCAATGCCGG CCTATGTTGT TGATACAACG AAGCAAATCA TCAAAGTGTT 4140 

GAGCGGGAAT AAAGTCACAG TATTTGGTTT AACTTATAAA GGTGATGTTG ATGATATAAG 4200

AGAATCACCA GCATTTGATA TTTATGAGCT ATTAAATCAA GAACCAGACA TAGAAGTATG 4260 

TGCTTATGAT CCACATGTTG AATTAGATTT TGTGGAACAT GATATGTCAC ATGCTGTCAA 4320 

AGACGCATCG CTAGTATTGA TTTTAAGTGA CCACTCAGAA TTTAAAAATT TATCGGACAG 4380

TCATTTTGAT AAAATGAAGC ATAAAGTGAT TTTTGATACA AAAAATGTTG TGAAATCATC 4440

ATTTGAAGAT GTATCGTATT ATAATTATGG CAATATATTT AATTTTATCG ACAAATAAAA 4500

TGTGTCAAAC TAGGGCATAC ATGATTAAGG AAAGATAAGC TGTCATGTGT TTGAACTTCA 4560

GAGAGGATAA TGTTATGAAA AAAATTATGG TTATTTTCGG TACGAGACCC GAAGCAATAA 4620

AAATGGCACC ATTAGTAAAA GAAATTGATC ATAATGGGAA CTTTGAAGCG AACATTGTGA 4680

TTACAGCACA ACATAGAGAT ATGTTAGATA GTGTGTTAAG TATATTTGAT ATTCAAGCTG 4740

ATCATGATTT AAATATTATG CAAGATCAAC AAACATTAGC AGGCCTTACG GCGAATGCAC 4800

TTGCTAAACT TGATAGCATC ATTAATGAGG AACAACCGGA TATGATTTTA GTACATGGTG 4860 

ATACTACAAC GACTTTTGTA GGAAGTTTGG CAGCATTTTA TCATCAAATT CCGGTCGGAC 4920

ATGTAGAAGC TGGACTTCGA ACACATCAGA AATACTCACC ATTTCCTGAA GAGTTAAATC 4980 

GAGTCATGGT AAGTAATATT GCTGAATTGA ATTTTGCGCC AACAGTAATT GCAGCTAAAA 5040

ATTTACTTTT TGAAAACAAA GACAAAGAGC GTATCTTTAT TACTGGAAAT ACAGTTATTG 5100

ACGCATTGTC AACAACAGTT CAAAATGATT TTGTTTCAAC GATTATTAAT AAACATAAAG 5160

GCAAGAAAGT TGTTTTACTA ACAGCGCATC GTCGTGAAAA TATTGGGGAA CCGATGCATC 5220

AGATTTTTAA AGCAGTAAGA GATTTGGCAG ATGAATATAA AGATGTTGTC TTCATTTATC 5280 

GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCGT 5400 

ACCTCGTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 5460 

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCGAGAG 5520 

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 5580 

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 5640 

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700 

TACCTTTACG TCACAAATAA TAAAAAAGCC CTAATCATGA AGTTGGTTTA GACAACCAGG 5760 

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820 

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 5880 
TTTATGAGCT TCTTTAAATA CATCGGAATT CAACGAATTA TTAAAGCTAT CTTCAGATTC 5940

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6000 
CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATCGT 6060

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120 

CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 6180

TTGTATGATA ACCATTATGA TTAATCGTAC ACGGACTGCA AGAACATCCA CCATATAAAT 6240 

TGAAAAACCT ATTACAATGT ATAAGCTAAT TAAAATTTTA ATTTTCTGTT GTAGCGTGTA 6300 

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 6360

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GGTACGAGTA AAAAAGGGGT 6420

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAt 6480 
GACGATTAAA ATAAGTGGGA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 6540

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGCAA TGACTCAGTA 6600

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTGATA 6660 
ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 6720

TTTTGAAAAG TGCAATATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 6780 

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA ACGCTTACTA 6840 

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6900

ATAGCCAATA AACAAAGGAG AGATAATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 6960 

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGAGGA AACAATCGAA 7020 

GTGACTAATC CAGCAACTGG AGAAACAGTA TCACATATTA CAAGAGCAAA AGATAAAGAT 7080
TCAGAACGTG CACAAATGTT GCGTGATATT GGTGATAAAT TAATGGCACA AAAAGATAAA 7200

ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 7260

ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 7320

ACAGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7380

GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GATTGCGCCA 7440

gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCGT CTTCAACACC ATTAAGTTTA 7500 

TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560 

GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGTGTAGA TAAATTATCA 7620 

TTTACGGGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 7680

CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 7740 

GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800

GCAGGTTCTC GATTATTAGT TCATGAAAAA ATTTATGATC AATTGGTGCG ACGTTTACAA 7860 

GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7920

CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATCA 7980

GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 8040

TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100

ATATTTC3GAC CAGTGTTAAC AGTGATTAAA GTGAAGGAGG ATCAAGAAGC AATTGATATA 8160

GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220

TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 8280

CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 8340

GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 8400

AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 8460 

AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 8520

GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8580

AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGCGAT GCGGCTATTA TAAGGACAGA 8640

GTTGTTTATT AATTATGGTG ATTTAGAAAT ATGAAGTTCA ATATGCAAAG TCATCGTTTG 8700

TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 8760

CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8820

TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880
GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 9000

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATt CATTGGCATT ACTGTCAGAT 9240

TCATTTCATA TGCTTAGTGA TGTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360

GCATTTTTAA ATGGTTTAGC ATTAATTGTA ATTTCAATCT GGATTTTATA TGAAGCTATT 9420

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480

GGTTTACTGG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA AGAAGAAGAC 9540

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATGGGAG ACTTATTGAA CTCTATTGGT 9600

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATCGA CCCAATCATT 9660

AGTATTGTAA TTTCACTCAT CATTTTACGT GGTGGTTATA AAATTACGCG TAATGCgTGG 9720

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGGAGATATT 9780

AAAAACATAG ATGGCATATT AGATGTACAT GAATTTCATT TGTGGAGTAT TACAACAGAG 9840

CATTATTCAT TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA TGATTATCAA 9900

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080

CTTATGTTCC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140

CGACATCTTT AGGTTTCAAA ATATGAATAT GTTTTTGATC ATTTGTATGT AAAATGCGTT 10200

CTATGATGTA CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440

CATtGTTATA GGAGGTCTTA TTAATGACAT TATTTTTATT AGAAGCTAAC AATCTTGATT 10500

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680
TTGATTACCT TGTAACTTGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10800

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860 

ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920 

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980 

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040 

AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160 

CGTTAATTGA ATAACGCTTA TGTTATAAGA GCACTCATAC CAAACCATAA TCATCTATAG 11220 

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280 

TCCTATTAAA ATAGTAGGGA TTAAAAGGGG GCTTGTCATG ATTAAAATTC AACAATTACA 11340 

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460 

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520 

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640 

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700 

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAAGGTTA TATTGATGGA 11760

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820

aCTAAAACAT AAAACGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880

TTATCTTTCC GACCGCATTG TTCTGTTAGG TGAAGGGTGC AATATTATTT CTCAATATGA 11940

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000 

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060 

AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120 

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180 

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 12240

TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 12300 

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360 

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCATGAAG 12420

GCAATGTCAT TATGGGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGCGATG 12480 
GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCG CCAGCAGAAA 12600

TGCCAGCCGC ATTGAGTGAA CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660 

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720 

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780 

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960 

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020 

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080 

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140 

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGGTCTT TCATTGTTAC 13200 

TGGAGAAATT TTCCAACATT TAGCAATTAG TTTATGGAGA TTTGTAGCGG GCTTTGTTGT 13260 

CGCATTGTTG GTTGCTATTC CATTGGGCTT CTTGCTTGGA AGGAATCGTT GGGTATACAA 13320

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT GCCAGCGATT GCGATTATTT TTATCGCTGC 13440 

TTTTTTCCCA ATTGTGTTCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTGGTCATTG TATCGCAATA TATTATTTCC 13560

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620

TTTAGTTTCT GGTGAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740

TTTI3VTTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800

ATAAGGAGAG ATGATGATGA GTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAG<ZAAT TATTTGTAGA 13920 

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040 

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100 

ATTAGGTGCT AGCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220 

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTCGAAA CATGAATCAT CAGATGAATT 14280 
TTTAGGAGTC AACGGGTCAG CAACGTATCA AATCACATTG AATCAAGTCG TAGTGCCACA 14400

ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460

TGCTTACCAA ATTCCAATAG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGATGC 14520

ATTTTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580

AAAACGTTAT CGTCAACTTA GAGAGGAATA TTATGCAATA TTAGATGACG GTAACTTAAC 14640

TTCACATTTA AATGAATTAA TATCATTGAA GAAGGACATG GGCTATTTAT TGTTAGATGT 14700

AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCAGAAGT 14760

TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCGACATTAA GACATTTAGG 14820

TAAACTTGAA GCAGAGTTGA AGGGGTAAGT GTGATAAGCT GATTTTTTGT TTAGATGCGT 14880

TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940

GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000

AAAAGTGTTA ATAAGGTGTA TAATGAAAAT GTGAACAATT AATGAACTTC TTATTTTAAA 15060

GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120

TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180

TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240

CAAACGCAAT GAATCGATGC TCCATTATAC ACGCACAGAG AAGATTAAAC AGATGTATAA 15300

AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360

AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT GTTTTCGATA AGTCAATAGA 15420

ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480

ATAATTCAAG GGGGTGGTAT GTCAAACGGT GCCGTTTTTT TGTCATATTT TTAAAACAAG 15540

CAAdATGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600

AAATCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660

AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720

CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780

CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840

CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900

TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCGGTTATGG 15960

ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020

GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080
TACAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200

ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260 

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320

CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380 

TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440

GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500 

AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560

AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620 

AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680 

GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740

CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800 

TGAnAAATnT CATTCATGTG GnAATC 16826 

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4012 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60 

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120

TATJAAACTA TTGAAAAATT CTTGTGATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180

AGCTTAGCTA mCCTTTTTAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360

ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420

ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480

AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA mGGCGGAGTC GCTTTTGCTA 540

CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600 

ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCCGATTC 660 
ATCAGGATGA TGATGAAATT AAGCCACCAT TTTTTATTCA ATGGGAAGAA AGTGATTCCA 780

TGCGTACTAA AAAATTGCAA AAATATTTTC AAAAACAATT TTCAATTGAA ACTGTTATTG 840 

TGAAAAGTAA AAACCGATCA CAAACAGTAT CGAATTGGTT GAAATGGTTT GATATGGACA 900

TTGTAGAAGA GAATGACCAT TACACAGATT TGATTTTAAA AAATGATGAT ATTTATTTTA 960 

GAATTGAAGA TGGTAAAGTT TCAAAATATC ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020 

CTTCACCATA TTCAATTTTT ATCAGAGGTG CTATTTATCG CTTTGAACCA TTAGTATAAA 1080

TATACGTAAG TGCTATGAGC GAGAATGCCC ATATGAATAA TGACAAGCAC AATGGAAAGA 1140 

ATCGTTAATA TATTATTTAA TCGTGATGAC TTAATTAAAA TGAAAAAGAT TGATAATATA 1200 

AATGTGAAAA AGATAAGTAT AACCCGTAAA CTAAAGTAAT TCACGGTGAG AGGTTGACTC 1260 

AATGTCATAA TGATTGCAAC GATGTTCATA ATTATAAATA GACTTAAAAT AATTGTTCTC 1320 

ATATCAAACA CCTCATTGTT AGATTATTGA CATTATAACA GGGGTAATTG TATATGAACA 1380

TTAATGTGGT TGCTTGAGGA AAAATTTATT CATTGAAGTC AAGTTGGTTC ATTTTAGAAA 1440

TGAATATCGT GTTAGATGAT GAAAGTATAT TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500

AATTGTACGA TAACATTAAA TTTAACACGA AACATAGATA TAAAATGATT CACAATTAAA 1560

ATGGGTAAAT TTGAACTTGC TAAACTATTA ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620

TTCAAATCTT ACACAAGCTC TGAATCGACA CTATAAGATA GAAACTGTAT AATTAAAGGT 1680 

ATTGTTAAAT AGAAGGAGAT ATCATAAATC ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740

CATGACGCAT TTGTTAAATC CCACCCAAAT GGAGATTTAT TACAATTAAC GAAATGGGCA 1800 

GAAACAAAGA AATTAACTGG ATGGTACGCG CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860

GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA AAAGTACCTA AATTACCTTA TACGCTATGT 1920

TATMTTCGC GTGGTTTTGT TGTTGATTAT AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980 

GACAGTGCAA AAGAAATTGC TAAAGCTGAG AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC TACATCCAAC CACGTATGAC TATGATTACA 2160 

CCAATTGATA AAAATGATGA TGAGTTATTA AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220

GTGCGCTTGG CTTTAAAGCG AGGTACGACA GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280 

ACATTTGCTG AGTTAATGAA AATCACTGGG GAACGCGATG GCTTCTTAAC GCGTGATATT 2340 

AGTTACTTTG AAAATATTTA TGATGCGTTG CATGAAGATG GAGATGCTGA ACTATTTTTA 2400 

GTAAAGTTGG ATCCAAAAGA AAATATAGCG AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460
CAAAATATGA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 2580

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640

TTTGCTGGCT CAAAATCATA TTACTTATAT GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700 

TTACCAAATC ATCATATGCA GTATAGGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760 

ACTTACGATT TCGGTGGTAG AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820 

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880 

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000 

CATAAGACTG TCATTTGGGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060 

TTTTTGAATA TAGTGAAAGA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120 

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180

CGAGTGCTAA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240 

CAACAACAAT GCCGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTG 3300 

ACAATCAAGT AAGATCATTG GCATATGTGA AAAATGTTAA AACGCAATCC ATACAAAATG 3360

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420 

TTAAAAAAGA AATCGATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480

GTAATTCGAT GGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA AGGGTAGATG 3600 

GTGTGCAAAA TGCGCAATTA AATGGGCAGA GGAACCGTGA AATCACCCTT AAATTTAAGC 3660 

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720

CAACAAGAAC AACGCCACTT GGATTGTTCC AATTTGGTGA TAAAGATAAT CAATTGTTGT 3780

TGATGGTCAA TATCAATCTG TTGATGCTTT TAAAAACATA AATATTCCAT TAACGTGGCA 3840

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACG 3900 

TTATCAGGCA TCACCACAGC AAATTCAAAG CGTCAGCnCC AATATATAGT GGATGCCGCA 3960 

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7778 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 60

AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 120 

TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 180 

ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCATTAG GCGTAGATAT 240

CGATAATTTA TATTTATCGC AACCGGATCA TGGTGAACAA GGTCTTGAAA TCGCCGAAGC 300 

ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAC TCAGTTGCTG CTTTAACACC 360 

TAAAGCTGAA ATTGAAGGAG AAATGGGAGA CACTCACGTT GGTTTACAAG CTCGTTTAAT 420

GTCACAAGCG TTACGTAAAC TTTCAGGTGC TATTTCTAAA TCAAATACAA CTGCTATTTT 480 

CATCAACCAA ATTCGTGAAA AAGTTGGTGT TATGTTCGGT AATCCAGAGA CTACACCAGG 540

TGGACGTGCA TTAAAATTCT ATAGTTCAGT AAGACTAGAA GTACGTCGTG CAGAACAGCT 600

TAAACAAGGA CAAGAAATTG TAGGTAATAG AACTAAAATT AAAGTCGTTA AAAATAAAGT 660

GGCACCACCA TTTAGAGTAG CTGAAGTTGA TATTATGTAT GGACAAGGTA TTTCTAAAGA 720 

GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGACATCGTT GaTAAATCAG GAGCATGGTA 780

TTCTTACAAT GGCGAACGAA TGGGTCAAGG TAAGGAAAAT GTTAAAATGT ACTTGAAAGA 840 
AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTGAGA GAAAAATTAG GTATATCTGA 900

TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACGAAG AATAGTACAC 960

AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 1020

TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 1080

AAGGTTTTTT ATTTTATTTA TTATTACATT ATCAATAGTT TTATAATCGA GCTTCAAAAC 1140

TTTAGAAAAT AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 1200 

ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 1260

ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 1320

AGCCTCCTAC TCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 1380

CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 1440 

CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 1500 

AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 1560 

AGACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 1620 

GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 1680
CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 1800

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA 1860 

GTTGATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AGCAGATCAC 1920 

ACAAGTGAAT CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980 

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040 

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100 

GCTAGAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160 

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACGATATTA TTAGAGAAGC AGGTGAACAA 2220 

GCTACATTTG AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280 

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTGA AGTTGCGCAT 2340 

CTTGCTAGTA TGTTAGCTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACGAGCTGGA 2400

CTTTTACATG ATGTTGGTAA AGCAATTGAT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 2460

GGTGTAGAAT TAGCGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2520

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 2640

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700 

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 2760

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TAGAATATCC TGGTCATATC 2820

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2880

TCACAAATTA GTGAGGGAGC TTTTTTAAGT TGTAGTCTTA AtCTAGTTAG AGAGCACTTT 2940

ATCGGTAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGACGGACCT TATATTAAAT 3000

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACAGA 3120 

TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180

TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTGT TGCAATTGTT 3240

GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300

TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360 

GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420 

CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480 
GGATGAACAA AACATGAGAA TAATGTTTAT AGGGGATATC GTAGGTAAAA TTGGACGAGA 3600

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660 

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3780

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3840

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3900

AGGAAGAGCG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3960 

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CATGCAGAAA CAACTTCTGA 4020 

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4080 

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140 

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200 

TTTATCACTA GTTTGCCACA AAGACATGTT GTTCCAAATG AAGGTAGAAG TGTATTATCT 4260

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320 

AATGATGACC ATCCATTTTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380

CTATCGTCCA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440 

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4500

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAAGTG GGGAAATCTT 4560

CGCTACGGCT ATGAATAGAA AAGGATATTA TTTATATGGA TATAGACATT TTTCAAGTCG 4620

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4680 

TAGTGATGAT TTAGATATTT TGATTGCATT TGACCAAGAA ACAATTGATG TTAACCATCA 4740

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAaCCAGA 4800

AGGATGTCAT GCACAGCTTA TTGAATTACC TTTTACAGCA ACCGCTAAAG AATTAGGTAC 4860 

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4920

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGTAGTTGA 4980 

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAAT 5040 

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100 

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160 

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220 

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280 
TGGATTATCT GGTATGACTG AAACGCCATT AGTCATTATT AATACCCAAC GAGGTGGACC 5400

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTGTAAGTGA 5580

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 5700

CAAGCGTTAT GCGTtAACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760 

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GGTAAACCTA GTGAATCTGC 5820

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880 

ATCGCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA TCGGTTTTAT 5940 

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6000

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6060 

TAATAAAGCG AAGAAAGTCG TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180 

ACCTTTCCTA CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240 

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 6360 

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 6420 

TATTAATTCT TATGGCGTTC ATTCTATTCA CGGACGTGCA TTACCTTTAG CTCAAGGTGT 6480 

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATCGGGA GGAGATGGTG ATGGTTATGC 6540 

TATAGGTATG GGGCATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660 

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGGGC CTTTAGAATT 6720 

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780

AACAAAACTA ATTGAAGATG CAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840 

ACCATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900 

TGTTGATGAc ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960 

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020 

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080
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5 



TGTATTTATA ACAGATCCAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 7200

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320

TTTTAGAGTA GGTCCGTTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 7380

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TTAGCCAAGT 7440

TAAGTTTTAT CATGTGCGAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7500

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACblAIAALi 7560

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 7680

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 7740

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 7778



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60 

TAATCCATTC TTTCAAACAT CACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120 

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180 

CGATGTTTTT ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540 

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600 

GCGATTGAGC CTTCATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660

GGATATTGCT TCAGCAACTc ATCGAAGGTT AGTATAGCTG TGTGTGCATG ACCACGATAT 720 
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10 



AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTCGTTC TATCAACAGT TGCGTCATGA 840 

AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 900 

AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 960 

CGATGGGCAA ATATATATGG TGCATTGGCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 1020 

GTAATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 1080 

ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 1128 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6252 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 60 

GATCATCCTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 120

GCGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 180

AACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 240

TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 300 
ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 360 

TTAATTTTAT CTTGTTGCTT TTTATTAACA TCACCGGCAT ATTTTGTTGG CACGTCGACA 420 
ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 480 

ATTOTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 540 

TTAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 600

TTAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 660

TGTTTATCTA GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 720

AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 780

GAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 840

CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 900

GATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 960 

TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 1020 
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TCTATCAATA ATGCATCATT TTGGACGTTG TTAAGGATAG CTTTATCTAT AAATAAGTGC 114 0 

ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGGATAAT AATTTCGTTC 1200 

ACATACTTTT CTTTCTCAAT ATCATTTTTC ATATTGATTT GTTTGCGAGA GGTACATACT 1260

TTAAGCATTA TCGCACATCT CGTTGTATAT ATTAAGTTTA TCATAACATG ATTTTATGTC 1320

GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 1380

GTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGGTG GATTTGCAGC AGCGCCGATT 1440

GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCGC CAGCAGCAAA TATTCCAGGA 1500

ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAGATACC ATCTAAGTTA 1620

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 1680

CCGACAACTT CAGTAGTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 1740

CGATCTTGTA ACACGTTGTC TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800 

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860 

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GGCAGAATGC AACACCTTTA 1920

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 1980

ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2040

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTGCAT CAATGTCATA TTGATCAATG 2100

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 2160

TCAATACCAG CAGTATCATT AACTTGGCCA CCGATACGAT CAGCAACTAT ACCAGTACGT 2220

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280 

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2340 

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 2400 

GCAGGGACTG CGATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 2460 

TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2520

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTTCTAA TTTTTTAATT 2580 

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2640 

ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 2700

CCAGGACGAT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760

CTAGGTGACA TATCAGTAAT TTCTGTCAAC AAATCTTTAA GTTCTTTGGA TTTATCATCT 2820 
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TGTTGTTTTA AATCAGCATT AAGCATGGTT GTAATGCCTC CTTAGATTTT ACCTACTAAA 2940 

TCTAAACCAG GTTGCAATGT TTTAGCGCCT TCTTCCCATT TAGCTGGGCA TACTTCGCCA 3000 

GGGTTTTTAC GAACATATTG AGCTGCTTTG ATTTTGTGAG CTAATGTACT AGCGTCACGG 3060 

CCAATTCCGT CAGCGTTAAT TTCAGATGCT TGTACAACAC CGTCTGGGTC GATAATGAAT 3120 

GTACCACGTT GAGCTAAACC AGTAGCTTCA TCTAATACAT CAAAATTACG AGTGATTGTT 3180

TGTGATGGGT CACCAATCAT AGTGTAAGTG ATTTTGCTAA TTGCATCTGA ATGGTCATGC 3240 

CATGCTTTGT GTACGAAGTG AGTATCAGTT GATACTGAGA ATACATTTAC GCCTAATTTT 3300 

TGTAATTCTT CATATTGGTT TTGTAAGTCT TCTAATTCAG TTGGACAAAC GAATGAGAAG 3360

TCAGCAGGAT AGAAGCATAC TACGCTCCAA GAACCTTTTA AATCTTCTTG TGTAACTTCT 3420 

TTAAATTGAT CTTTTTTTGG ATCGAAArCT TGCGCTGTAA ATGGTAAGAT TTCTTTGTTA 3480 

ATTAATGACA TAAATATCTT CCTCCTAAGA ATTTAAGTAT GAATTAGAAC TATCAATTGA 3540 

TTGCGCTTAA TTATAATAAT TCTAATCTCT TAGTTAGCAT TATTACATTT TGATCCAGAA 3600 

TAGTCAACTG GATAACTTTG TAAAGTGAAT GATTACTTTT AAAATAAAGA AAGATAATAT 3660 

AAAGTGCTTT GATAATGGAT TTTGTAGTTG ATGATTTAAA AGGTTGTGTC TATATTTAAT 3720

ATCTTGATTT TAATGTAAAA AATGTAAAAA AAGAAGATTT GTATTCTCAA CTAAGTCAAC 3780

CTTATTGATA ATGGTATGAG AATATTTGTT CGAGATGGAT GAAGGTAATG AGTGAGAAAC 3840

TGGATTTTTA AAGTATGAGA CAATATTTTA AAAAGTTCAA TTATTAACTT ATAAGCAAAT 3900

AATTGCTATA AAAAAGTTTG GACGTGTACA ATTGCAATAT GAAGATTTTA AATTAATTGT 3960

AAAGTATCGA GGAGTGGGTA ACGTGTCAGA ACATGTATAT AATCTTGTGA AAAAGCATCA 4020

TTCTGTTAGA AAATTTAAGA ATAAACCTTT AAGTGAAGAC GTTGTTAAGA AATTGGTAGA 4080

AGCT33GACAA AGCGCTTCGA CGTCAAGTTT CCTGCAAGCA TACTCAATTA TTGGTATCGA 4140 

CGAlGAGAAG attaaagaaa ATTTACGAGA AGTTTCTGGA CAACCTTATG TTGTAGAAAA 4200 

TGGCTATTTA TTCGTCTTTG TTATTGATTA TTATCGTCAT CATTTAGTTG ATCAACATGC 4260 

TGAAACTGAT ATGGAAAATG CATATGGTTC AACGGAAGGT TTGCTAGTAG GTGCAATCGA 4320

TGCAGCATTA GTTGCCGAAA ATATTGCGGT AACTGCTGAA GATATGGGGT ATGGCATTGT 4380 

CTTTTTAGGA TCATTAAGAA ATGATGTTGA ACGCGTTCGA GAAATTTTAG ACTTACCTGA 4440 

CTATGTCTTC CCGGTATTTG GTATGGCAGT AGGGGAACCC GCAGATGACG AAAATGGTGC 4500 

AGCCAAGCCA CGGTTACCAT TTGACCATGT CTTCCATCAT AATAAGTATC ATGCTGATAA 4560 

GGAAACACAG TATGCACAAA TGGCAGATTA CGACCAGACA ATCAGCGAGT ACTATGATCA 4620 
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CAAAGCAAGA TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 4 740 

AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 4800

GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCAGCA TTATCATTTG AATCGAAAGT 4860

ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 4920

CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 4980

TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 5040

ACCGACGCCA GCAACGCCGA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA 5100

TTGTAAATCA ATTTCTACAT TAGCGACX3GG TGCGACCATA ATTGCAAGCA TGGCAGGGTA 5160

AATGCCTGCA CAACCATTTT GTCCAATCGA CAATCCAAAT GTCGCAGCGA AATTGGCAAT 5220

ACCTTCTGGC ACGCCTAGAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCCGC 5280

GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 5340

TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 5400 

AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 5460 

TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACGTAAGA CGAACGTCAC 5520 

AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 5580 

AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 5640

GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 5700

TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 5760

CTTCCACGTG CTTGTTCAGG GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 5820

AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 5880

ATGAGACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 5940

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 6000

ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 6060

TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 6120

GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 6180 

TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 6240 

GACTAGTATG CT 6252 

(2) INFORMATION FOR SEQ ID NO: 51: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6730 base pairs 
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:



ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 60 
GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120 



TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 180 

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 240 

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 300

TTGGCGTTAT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 360 

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GGAATGATTT TTTAATTTCT 420 

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 480

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 540

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 600

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 660 

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 720 

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTGCAGCTA CGCAgTATTT ACATTATTTA 780

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATCA ACCTGTGTGG 840

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900

GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960 

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020 

GTACfTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080 

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGGAACTGTC 1140 

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 1200 

AGAGCTAAAG GTAATTCGAC AATGCGTGTA CTTTTTGGAC ATGCACTTAG AAATGCTTTA 1260

ATTCCAATTA TTACAATTAT CGTTCCCATG TTAGCAAGTA TTTTAACAGG CACTTTAACA 1320 

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 1380 

AATGATTTCT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560
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TGAATCAGGA ACCTGAAATG CAACGAGAAA GCAAAAACTT TTGGCAAGAT GCTTGGGCTC 1680 

AGTTAAAACG AAATAAGTTA GGTGTTGTCG GTATGATAGG TTTAATTATC ATTGTAATAT 1740 

TTGCTTTTAT CGGTCCAGTT ATAAATAAAC ATGATTATGC TGAACAAAAT GTAGAACATA 1800

GAAATCTTCC GGCAAAAATA CCTGTATTAG ACAAAGTTCC ATTTTTACCT TTTGATGGTA 1860

AAGATGCAGA TGGCAAGGAT GCTTATAAAG CAGCAAATGC TAAAGAAAAT TATTGGTTTG 1920

GTACTGATCA GTTGGGTCGA GATTTATGGA CAAGAACATG GAAAGGTGCT CAAATTTCAT 1980

TGTTTATCGG TGTTGTTGCA GCGATGTTAG ATATTTTTAT TGGTGTTGTA TATGGTGCGA 2040

TTTCTGGATT CTTCGGTGGA CGTGTCGATA CGATTATGCA ACGTATACTT GAAGTCATAG 2100

CATCTATTCC GAATTTAATT GTCGTAATTT TATTTGTATT AATTTTTGAA CCATCCATTT 2160

GGACAATTAT ATTGGCTATG TCTATCACAG GCTGGTTAGG CATGAGCAGA GTTGTACGTG 2220 

GAGAATTTTT AAAATTAAAA AATCAAGAGT TTGTCATGGC TTCGAAAACA TTGGGGGCTT 2280

CAAAATTCAA ATTGATATTT AAGCATATTT TACCTAATAC ATTAGGTGCT ATCGTGGTTA 2340

CATCAATGTT TACAGTACCT AGTGCTATTT TCTTCGAAGC ATTTTTAAGT TTCATTGGTA 2400

TAGGTGTACC CGCACCTCAA ACATCGTTAG GGTCATTAGT AAATGATGGG CGCGCAATGT 2460

TATTAATTTA TCCACATGAA TTATTTATAC CAGCAATGAT TTTAAGTTTA TTAATTCTAT 2520

TCTTTTACTT ATTTAGTGAT GGATTACGTG ATGCATTTGA TCCGAAAATG CGTAAATAAA 2580

AAGGGGGCAT AGCATATGAC TGAAAGAATA TTAGAAGTAA ATGATTTGCA TGTTTCCTTT 2640

GATATTACAG CAGGGGAAGT GCAGGCAGTG AGAGGCGTAG ATTTTTATTT GAACAAAGGG 2700

GAAACATTGG CAATTGTTGG TGAATCAGGT TCAGGTAAAT CTGTAACAAC AAAAGCAATT 2760

ACAAAATTAT TCCAAGGGGA CACAGGAAGA ATTAAAAAGG GAGAAATTTT ATTTTTAGGG 2820 

GAAGATTTAG CAAAAAAACC TGAAAATGAG TTGATTAAAT TACGTGGCAA AGATATTTCA 2880 

ATGATCTTTC AAGATCCAAT GACATCTTTA AACCCAACGA TGCAAATTGG TAAACAAGTC 2940

ATGGAACCAT TAATTAAGCA CAAAAATTAT AGTAAAGCAC AAGCTAAAAA GCGCGCATTG 3000 

GAAATACTAA ATCTTGTAGG TTTACCAAAT GCAGAAAAAA GATTTAAAGC ATATCGTCAT 3060 

CAATTTTCAG GTGGACAAAG GGAAAGAATT GTTATTGCAA CCGCATTAGC TTGTGAACCT 3120

AAAGTGCTCA TTGCTGATGA ACCAACGACT GCATTAGACG TAACGATGCA GGCACAAATT 3180

TTAGATTTAA TGAAAGAACT ACAACAAAAA ATCGATACAG CAATTATTTT TATAACGCAT 3240

GATTTAGGGG TTGTTGCGAA TATTGCTGAT AGAGTGGCAG TTATGTATGG TGGTCAAATG 3300 

GTTGAAACAG GAGATGTTAA CGAAATATTT TATGATCCAA AGCATCCATA TACATGGGGA 3360 
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GGAGCGCCAC CTGATTTATT ACACCCACCT AAAGGTGATG CATTTGCGAG ACGTAGcAAT 34 80 

ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 3540 

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660

GGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 3720

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 3780 

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3840 

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3900

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3960 

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAt AGTAGCTGAA GGTATTGATA 4020 

TCCATCATTT AGCAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4080

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 4140

GCCAACGTAT TGGaATTGCC CGTGcATTAG CCGTTGaACC AGAATTCATT ATCGCGGACG 4200

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4260

TACAACGTGA AAGAGGGATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4320

ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 4380

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 4440 

AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAT GATGAAGCAA 4500 

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4560

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620

ATGCAATGAC GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCATTAA 4680

GTGGTTGTGC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740 

TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4800 

TGAcTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCX^STGITAG 4860 

GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4920

GAAGCGATGC TAAATGGAGC AATGGTGACA AAGTGACTGC ACAAGACTTT GTTTATGCTT 4980

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100 

TAAATGATGA AAGATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160 
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ACGGTACGGC AGCTGATAGA GCGGTATACA ATGGTCCaTT TAAAGTTGAT GATTGGAAAC .5280 

AAGAAGATAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5340 

TAGATAAAGT GAATTATAAA GTTATTAAAG ACTTACAAGC CGGTGCATCA TTGTATGATA 5400

CTGAATCAGT AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 5460

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CATTTTTTGT AAAAATGAAT GAAAAACAAT 5520

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGsTATCGCA CAAGCAATAG ATAAAAAAGG 5580

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 5640

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 5700

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 5760 

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820 

CAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 5880 

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGGTTGGTC 5940 

TGCAGATTAC CCTGATCCTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000 

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060

GGCACTTCAA CCGAACGAAC GATATGAAAA CTTGAAAAAA GCAGAAGAAA TGTTCCTAGG 6120

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180

AAAAGGATTA ATTtACCATA AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 6240

TAAATGGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 6300

GGAGACATAT CTCCAGTCTT TTTGTGTTGG ATAAAAaCTT TGGGAATAAA AATTTAAAAT 6360

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 6420 

TAAA&TTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 64 80 

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6540

TGTTGAGTGC TTGCGGAAAA AGCAGTAATA AAGATGAAGG AGTAAAAGAT GCTACTAAAA 6600

CGGAAACCTC AAAACATAAA GGTGGTACCT TAAATGTAGC ATTAACAGCA CCGCCAAGTG 6660

GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6720

ACGAAAGCTT 6730 

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6482 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 60 

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120 

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyG 180 

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 240

ATTTGTATAG CAATAAATAT AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 300

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 360 

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 420 

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 480 

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATACAGCGAG CCATTTTGTG CTAATTTTTG 540 

CGCGTAAAAA TGTAACGTCA AGATCACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 600 

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATGCATTC CAAGCAGATT 660 

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 720 

CATTAGGCAA TATGATGACG ACAGCCGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780 

GTTTTAGTCT GGATACAGTG ACAGACATTT TAGCAAATAA AGGGATCTTA GATACTGAGC 840 

AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900 

AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 960 

ACCGTATGTC TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 1020

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080 

CTGGSACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140 

TATGTGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATCCATAA CTTTGTGAAT 1200 

TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 1260 

TCGTTGACAT AAATAGCGCA CGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 1320 

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 1380 

CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 1440 

CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500 

TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 1560 

CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 1620 
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aAGTCTGACG GcCGTCTTCT AATAAATGTA ACGTTAGAGT ATGGcCACCA GTCCCAACAG 1740 

ATAATACGGT TGTATTATCG TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800 

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860 

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 1920

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980

GTCGTTTTGC AATTTTAtAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2040

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTTTTCTA ACATATGTTT 2100

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT CGAGCGTGTT CAACAACTGc 2160 

TTTAAGTAGT TTTTTGCCAA CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220 

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TCAGCTAAAG CATTATTTTC 2280 

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 2340

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 2400

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 2460

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 2520 

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580 

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640 

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2700 

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGCGTA TTAATTGCAA GTAGCAGCAT 2760 

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2820 

TGATAAACGT GTATATATGG CTTGTTTAAC TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880 

TTTCCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT GTGTTAACAA AGGATGAATT 2940 

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000 

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAATATAAAG 3060

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 3120 

ACAGATGATG CAGAACGTAA TTATAAATTT TTTACAGAAG TACTAGGCAT GCGTTTAGTT 3180 

AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA CTTTTTTTGC AGATGATGTA 3240 

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3300 

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360 

TATGAACAGC GCTTTGATGA GTTTGGTGTT AAACACGAAG GTATTCAAGA ATTATTTGGT 3420 
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TTAAATGAAG GGGTAGCACC TCGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 3540 

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 3600 

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGCATG AAGATAATGT CGCATTACTT 3660 

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3720 

GCaGCACGTC AAGGTTATGG tGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780

GCAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACtC AGGCATCGTT 3840

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3900

ACAGATGGAC CAGGATTTAT GGAAGATGAA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960 

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4020 

ACGAAGCGTC AACATGGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 4080 

GAAGGACAAA ATGGTGCGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 4200 

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 4260 

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4320

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440

TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4500 

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTT 4560 

AATACACGTG GGGCACAAGT CGAAGAAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620 

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 4680 

TGGAAAAGAT TTTTACTTTT CATCTGCCCG CTTTTTTGAT TTTGAAGTGC TGTACTAAAT 4740 

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTAAA 4800 

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACt 4860

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 4920

ATAATAATGA TGAAGTAACG TCTGCTGAAT GGAACGCTGA AACGTGACGC AAATGCATAC 4980

ATTAATCCAA CAACAGTATT GTAGATGACA AGTATCATAA TGACAGACAT AATAATACCA 5040

ATTGACGGAG ACATTTGTGT CGCTAATTTT AATGTAGGTA GATCTACGTG TTTAATTTTA 5100 

TCGAATTGAG AAATTAAACC TAGATTAATC ATCATGAGTA AAAATGTAAT GATTAAACCG 5160 

CCAATCAAGC CCCCGTATAA CGTTGAGTCA CGATATTTAA CTTTACTACC CATCACTGAT 5220 
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10 



CCAGGTGATA ATGATTTCTG CTTATGAATC TGAGCATCAT TATTAGCGGC AGTAAAATCA 5340

AGATGACTTG TTGTGAAATA GTAGACCGCA ATCATAATGA CAATCGCAAT TAAAAATGGG 5400

GTAACACCGC CAAGCACAGC AATTAAACGA TCGAATTTTA GAAACAGTGT TGCTAAAATA 5460

AAGGCGACTA ATATGAGTGC GCTCAGCCAA TACGGTAAGT TGAAACTTTG ATGAATGGTT 5520

GACGCACCAC CTGCAGTCAT AATAATAGCT AAAGACAACA TAAACATTGT TAAAATAATA 5580

TCAAAACCTC TTGCAATAGA GGGGTATAAG AAATAGTTAA TTGAATCAGA ATGATTTCTG 5640

GACTTTAGAT GATGACCTGT ATGCATGACA ACCATTCCAC CTAAAGTAAT CAATAGTCCT 5700


fSTTACAATAA 

V3l XAV«AAXAA 


TGCCTGAAAT 


GCTATATGCG 


CCATGACTTG 


TGAAAAACTG 


GAAAATTTCT 


5760 


lu/iV* \-AVJ X AVJ 


CAAAGCCGGC 


ACCAACGACA 


ACACCAACAA AGGCAAATGC 


CACAATAATG 


5820 


utAl. 1 1 1 A 


AfiATAfYSpAT 


<<JAX X a.aaaaa 


TGTCCCTTCG 


TAATTTTAAG 


TAATATAGAA 


5880 




ACATY3TTAAT 


GAAAAATATA 


GTACTAATAT 


AGTATTTTGT 


TAAATTGGAG 


5940 




U\9 X W X \— w\J X V» 


AX X IVfAl xaa 


TTTATTAGTT 


GATTTTGCAT 


TTTTTTGCTG 


6000 


TAAAGTTGTT 


A x AA X x 




TAGCATAGAT 


ACACCAATCC* 


CCTCACTACT 


6060 


L.^0AA 1 AVJ 1K3 


A<t?V7V7V7Al 111 


XXX *—vjV>J i*J 1 1\ 


GCTAGGTCGC 


CTATTTATCA 


TCGTGTTTG C 


6120 


Vj I Ay L-cLfV HaL. 


1 AAA*-A\«A<*J 


TaPPAfTAAA 

X AC V^rtV* X AAA 


TAAGTGCACG 


ATACATGCAT 


CAAATGTCGT 


6180 


CTTTAGTcTA AGTAACGATC ATGCATTAAC ATTTTCAAAA TATCTATTTG AGCTTGAAGA 6240

TCTTTACCAA TATTGGTATC ACGAATCTTC TTACGTTGTA ATTCTTTATC TACGACGCGC 6300

TTTATAGAAA GTTCATCGAT ACCTTCGGAA AGTATTTTTn CTTTAGCGTT AAATTGTTGG 6360

TGTGCAACGA GTTGCATACC GAATGAATTA TACAATAGTG TATAGCCTGC AATGCCAGTn 6420

GTTGACTGAT AAGCTTTTGA AAAGCCACCA TCAATGACAA GCATCTTTCC ATCAGCCTTG 6480



T^T r 6482 
(2) INFORMATION FOR SEQ ID NO: 53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 16592 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 60 
AAATGTAACT TCCATATATG CCCCTCCTTT TCTTCAATTC ATTTTATCAT AAAATTTGTA 120 

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AATTTTT CT A ACTTTAACGT AGACATAACT 
ATATAAATTT TGATAATTAC GTTATACTTA 240

TCATTAATAA GTATCACATT AAACATGATA CATGAATCGA TATTTCATTT AAGACACTGG 300

ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360

ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420

ATCACAATCT TTGGTGCACT GCGTGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 480

CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCGT 540

GACATkwnTA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600

GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTGT ATCATAGACA TGATGTTAGT 660

AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 720

TTAAAAGGTA ATCGACTATT CTATTTAGCA ATOGCACCAC AATTCTTTGG CGTTATTTCT 780

GATTATCTAA AATCTTCTGG TGTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840

AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900

TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 960

ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020

TCAAACATCC AAGTTACATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 1080

GAATCAAGTG GCGCGCTAAA AGATATGGTG CAAAACCAGA TGTTACAAAT GGTTGcATTA 1140

TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 1200

GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 1260

CAATATGGCG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320

GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380

AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 1440

CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500

GTTAGATTCA AACCTATTAG TAATCAATAT CCAACCTAAT GAAGGTGgTA TCTTTtACAT 1560

CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 1620

ATGaGCGcTC aAGaTAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 1680

CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 1740

GTTGATGGAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 1800

GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 1860

GGACGATATT CAATAATTGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 1920
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TATATTATGA 


AATTATATTT TACAATGCCC AAAACTATTT TAATAATCAT TGAACAAATG 2040

GGTGTATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 2160

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TGTTTCGCCA TACCATCTTG 2220

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280

TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 2340

AATATCACTA TGCAAATTCA AATGATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2400

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 2460

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2520

CATTGGTTGG ATATGCACGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 2580

OGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2640

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2700

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2760

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2820

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2880

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2940

CTGTTAAACA GAGTCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 3000

ATATCTATAG GTGTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGTTTT AAATTAACTT TGGCATCATA 3120

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180

GTTAGTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 3240

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 3360

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 3420


45 


AAATCCTCCA 


1 1 iAALAjiAl 






CGAGAATTAT 


CTCGGTTCT C 


3480 




ATCTTTATAT 


TTCGCAAGTA 


AAGCGTCTAC 


ATCTCCACCT 


TG AG CTTTCA 


CTATTTGATA 


3540 


50 


GTCATTTTTA 


ACAGCAACAT 


CGTTAAACGT 


TTCAATACTT 


TCAAATGGAT 


AATTCGTCAT 


3600 


AC CAATTT CT 


TGAC CTTGAT 


AAATGAATGG 


CGTACCTTGT 


TGCAAGAAAT 


AAAGAGCTGC 


3660 




ATGACTTGTT 


GCTGATTCAT 


ACCAATACTT 


GTCATCGTCA 


CCCCACGTCG 


ATACACGTCG 


3720 
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CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 3840 

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3900 

GTCATCAG CA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3 960 

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4020 

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 408 0 

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 4140 

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 4200 

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAG CATCAT ATTCCCATGT 4260 

AGATC CATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320 

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4380 

ATGTTCATCA GATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440 

AACACCTTTT AATAAACGAT CAAAGTCTTC CAT CG TTCCA AATTCATCCA TAATCTCTTG 4500 

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4560 

25 AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTG CAGGTAAATC 4 620 

CCCAATACCA TCGTGATTAC TATCATTAAA ACTTCTTGGA TATACTTGAT ATGCTACTGC 4680 

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 474 0 


TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 4800 

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4860 

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4920 

35 

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4 980 

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 5040 

40 TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 5100 

TTTATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160 

TCCTTTTTCT GTAATAAATA TT AATT CAT C TACACCTTGT TCAATAACAT GTCGTGTCAA 5220 

45 ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 5280 

TGTAGGCTTA CCAATCACAA T AAATGG CAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340 

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 54 00 

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460 

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520 
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[illegible] GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCAGTTTTGA 5640

AATTGTCGCT TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 5760

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 5820

AGTCAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 5880

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 5940




[illegible] ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000

[illegible] ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6060

[illegible] [illegible] GATGGTGTTC CTAGGTTCAC TTCCGGGCTA [illegible] 6120

[illegible] [illegible] GTGAATGACG TTTCAGCATC TTTGGTAGGT GATACXTCAA 6180

[illegible] [illegible] [illegible] CACTTACAAC AACAAATTCT AAAGTTTCTG 6240

[illegible] [illegible] TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6300

[illegible] [illegible] TTGTGTATAG TACATGCTAT CATAACACAG TAAATATTTT 6360

[illegible] AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420

[illegible] TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480


TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA TATTATTCTT CAATCCATTG 6540

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACGG TAACTATCAT AGTAATTAAT 6660

ACTTGATGAG AAACCAGGTG [illegible] [illegible] CCAGTTGCGA CAACATCACG 6720

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780

TAATCCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 6840

GCAACCTTCT CTCCAAATCA TAGCTAAATC ACC^AGTTTT AAATTCCATT CATTATCTTC 6900

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGGATAAGAA CAAATTTTAC TCATATATAA 6960

TGCTTTACGA ATOTTTTCTA AAAAGTCTTT CTTGTCACCA TCAAATGATG CTTTTGGACC 7020

ATTTAATTCT TTAGAAGCAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 7140

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT TCAACTAATG CTTCTTTATT 7200

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 7260

AGCATTCCAG TCTTTGAACG TTTGAGCAAT GTCTTCATGA GACATGCCTA ATAATTCTTT 7320
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CATTTTCACA TAGTGTCCAG CACCATTAGG TGCAATATAA GTAACACATG AAGCACCGTC 7440 

TTTTGCCTTT GCAGCAATTG CATGAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTGTTG 7500 

TCCACCCGGC ATT AATGACG GACCAGTTAA CGCTCCAATT TCACCACCAG AAACGCCCAT 7560 

ACCAATAAAG TTGATTGCAC TTTGTGywAA TGCTTTATTA CGTCTGATAG TATCTTGATA 7620 

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 7680 

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT AAAATTTTAC GTGGTTTTTC 7740 

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7800 
TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG . 7860 

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGCT AAACCAATAA CTCCAATTTG 7920 

TTGTGTCATA TTACTTACCT CACTTGTTGA TTTTTCATTA GTATTGTATC ACAAAATAGA 7980 
20 CATACACTAC ACTAAATCAT TTCG AATGTC GCGCAACTAT TTTGATTATT TCTAACACTT . 8 04 0 

GACTTGCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100 

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160 

25 AACCGCCACC AGAAATAATT GTATTTGGAG ATAATCCTAA ATTACGAGCA CTTTCTTGTG 822.0 

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGAfAACTT TGCTCCACTG 8280 

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATGA GTCATATGTT 8340 
TGACTTGTGT TTTTATTCTT TGTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT . 8400 

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GCAGTAGTTA 8460 

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCGCAATA TTAATAGCAC 8520 

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8580 

ACATTTGCGT CGGTGCACCT AGAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 8640 

AACCAAAGTC CGCGTCCAAG AACTCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 8700 

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8760 
CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG . 8820 
45 TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA * 8880 

CAACAGTATC GATATGGCtC GTCAAATATA ATTTAGGTAC TTCGCCTTCT TCGATAGTAC 8940 
TATTCATTGT ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT . 9000 

SO CTTTAACATC TAACCCTAAT. GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 906 0 

CATTCCCTGT CTCAGAATCG ATTTGTAGAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 9120 

EP0 786 519 A2 



GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGGTTT GTAAATTGTG 9240 
TCATATTATT TTCAATTTAT. TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA . 9300 

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 9360 
CTCATGTTTT TTATTATATT CCTTTATGAT GATTGCTAGT TATATCGTCT CAAGTTAAAA . 9420- 

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 9480 

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 9540 

TGTTATTAAC GAGTATCATT CCAATTTTAT AAAGTATATC TCAACTACCT ATACAAAATC 9600 

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 9660 

ATTACTTTCA AACATGGGTC ATACGGATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720 

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 9780 

ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 984 0 
TTCAAACGCG TCTCATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA - . 9900 , 

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 9960 

AT ACTTT CGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCATATCA 10020 

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCGGAACGCG 10080 

ATAGACTtCG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAATCGC AATGGTATCT 10140 

TCATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT . 10200 

TTACATTTGT ATTCGTCtAG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260 

GCAATGCTTA GTC CAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT .10320 

GATTTTG CAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 10380 

TTTAATACGA CAGTT AATGG TATAAATAAC AATACGATAA TACCGAGTAC AATTGGACTC . 10440 

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500 

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560 

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620. 

ATACCCACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680 

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740 

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800 

ATCTTTTAGT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 10860 

TCTTTGTGTC GCTTCTTTAT CTTGTCCAGC AAATACTGTC ACTAGACGAT CAGGTAATAC 10920 
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AGAATTGATC ATAACTAGTG TTGTACCATC TTGTTTAAGA ACTTTGtCAA CATCTTCTGC 11040

AGTAGTTAAT TGCTCATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100

GTTCATGTAT AAATCGAAAT TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATATTTAA 11160

ACTAtCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280

AATTTGTTCA CATGTTTTCA TTAATATGTT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGCACA 11400

TATTTTAGTG CCAAAAAATA ATACATCCAT CGACAAGAAC AAGATAAAAC AAGTTGTCGA 11460

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520

CGCTGTTTAA TATGATTCAT ArATTTAGCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580

ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC GTTGGTAAAG TATTAATTTC TCTAGCTATA 11820

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTTTCATCA 11880

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000

AACGTTAGAT TATATCCTTC TTTATTTTTA AAGCTCTTTT TATAATGATT TCTCGTATTC 12060

ACAAGATTTG TAGCATCTAC TTCAATGATC ATCCATGCAT GTGGAATCTC TGTTACACTA 12120

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180

CTATTGTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240

TGTTTGTCAG ATTGAGCTGT GGTACCACCA TTTTCAATAA CTGACATTAT ATCCTTCTTA 12300

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGTGATA AATCAATGTC ATGCTCTGAA 12360

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420

GTAGAtGTCT GTTCCACTGT TGCAGTAGCT TTTTTAGTAG ATTTCTGAGT ATGCTCATCC 12480

ACTTTTGGTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 12660

TCATATTCAT CAATATGATC ACCAACAGAA ACTAAGCATT GTTCAATGGT GCCTTCATGA 12720
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AATTCACGCA TTTTATTTAA GATTTTTTCT GGATTCATCA TAATTTCATT TTCTAATACA 12 84 0 

GGAGAAAATG GCATAGATGG TACAtCTGGA GCAGCTAAAC GCATGATTGG TGCATCTAAA 12900 

5 

TCGAACAAGC AATGCTCTGC AATAATCGCT GACACTTCTG ACATAATACT ACGTTCTAAA 12 960 

TTATCTTCAG TTACAAGTAA AACTTTACCT GTATGTTTAG CACGATCAAT AATTGTTTCT 13020 

TTATCTAATG GATAAACAGT TCGTAAATCA ACGACTTCAA CATTGATACC GTCTGCAGCT 13 080 

10 

AAAATATCCG CTGCTTGTAA ACAATAATTG ACCATTAATC CATAACAAAA TACTGTTAAA 13140

TCTTCACCTT CACGTTTCAC ATCTGCTTTT CCTAAAGGTA CAGTGTAATA TTCTTCTGGC 13200

ACTTCTTCCT TTAAGAAACG ATAAGCTTTT TTATGCTCAA AGTACAATAC TGGATCATTT 13260

GATTCGATAG ATGATAATAA AAGCCCTTTA GCATCATACG GTGTGGAAGG AATAACAATT 13320

GTTAAACGTG GCGATGAAGC AAATATACTT TCAATACTTT GTGAATGATA TAGTCGTCCG 13380 

20 TGAACACCGc CACCAAATGG TGCACGAATC GTTAATGGGC ATTGCCAATC ATTATTTGAA 13440 

CGATAACGCA TTTTCGCAGC TTCACTAATA ATTTGATTTG TCGCAGGTAA AATAAAATCT 13500

GCAAATTGAA TTTCTGCAAT TGGTCTTTT A CCTACCATAG CTGCACCAAT GGCAGTTCCA 13 560 

25 

ACAATATTTG ACTCAGCTAA TGGCGTATCG ATAACTCTGT CTTCAC CAT A TTTTTGTTGC 13 620 

AGTCCTTGAG TAGTACCAAA TACGCCACCT TTTCTACCAA CATCTTCACC AAGAATAAAC 13680 

ACATGTTTAT TTTGTTGTAA TGCTAAGTCT TGTGCCtGcG TATCGGCTCT . AAATAAGATA 13740 

30 

ATTTAGCCAT TAGTTAAGAC TCCCTTCTTC GTACACAAAT GCATAGGCTT CTTCGACACT 13 80 0 

TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13 860 

TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13 920 

35 

CTTTTCATTG CAGTCTGCTT TTTTAAGcGT TTCACGCTCT TCTTTCGTAC GATATTGGTC - 13980 

GTCATTCATCT GATGAATGAG CTGTCATACG ACTTGTTACT GCTTCAATCA AAGTTGAACC 14 04 0 

40 TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 14100 

ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160 

AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14220 

45 TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT - 142 80 

TGAGCTACCT TCACCAACAG ' TTGCTGTTGC AATTTTCTTC TTACCATCCA TTTTTAAAGC 14340 

TAAAGCAGCA CCAACAGCAT GGGGTATTTG AGTTGCTACC GGTGAACTTT GAGACAAAAT 14400 

50 

ATTCTTAGCT CTACXACTAA AGTGTGATGG CATTTGTTTT CCACCAGAGT TAACATCGTC 14460 

TTTCTTTCCA AACGCTGATA AAAACGTATC ATACGCTGAG ATACCCATAT AAGTAACGAA 14520 
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AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GGAATTTTAC CTGCACGGTT 14640 

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 14700 

5 

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14760 

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820 

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14880

10 

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14940 

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCTTCG TTTGTGTCAT GG CTATCAAT 15000 

15 CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060 

CTTTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 15120 

AGGCATCATG TTATAGTTTA CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 15180

20 AACACCTTCT TTTGATCCAA CATGTGCCAA TTGTAATTTT C CT AT ACAAT CACCAGCTGC 15240 

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300 

AAG t TTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360 

25 TAGCAACACT TTATCTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420 

GTTAACATTT ATATCATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 154 80 

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 15540 

30 

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600 

TCCGATAACA CCACCACCAA TAATACCAAT ACTTGATGGT AACGtCTTTA ATGATAATAT 15660 

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720 

35 

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780 

TTCGACAGAA ATTGTGCCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCCGT 15840 

40 GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900 

ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960 

ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGGAG 16020 

45 CGATTTAGTA GGAATACAAC CTTTATGGAG ACAAGTACCT CCTAATAGTT GTCGTTCTAC 16080 

TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTATCGCA GCAACATATC CTGCAGTACC 16140 

TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTGTTACTCC TAACTAATGA 16200 

50 

TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260 

CATGATTGTC TATTTAGTTT GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16320 
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TAAATCAGTA ACACTTGCAC CTGAAATCAT TCGTGCAATT TCATCTACTT TATCATCGCT 1644 0 

AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 16500 

5 

ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 16560 
TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16592 
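
The number printed at the right of each row is a running total of bases, so a transcribed row can be checked mechanically. The following is a minimal sketch, not part of the original listing; the function names `parse_listing_line` and `check_running_counts` are hypothetical, and it assumes each row consists of space-separated base groups followed by an optional all-digit running count.

```python
def parse_listing_line(line):
    """Split one listing row into (bases, running_count_or_None)."""
    tokens = line.split()
    count = None
    # Assumption: a trailing all-digit token is the printed running count.
    if tokens and tokens[-1].isdigit():
        count = int(tokens.pop())
    # Join the remaining groups; case is kept as printed in the listing.
    bases = "".join(tokens)
    return bases, count


def check_running_counts(lines, start=0):
    """Return True if every printed count equals the cumulative base total."""
    total = start
    for line in lines:
        bases, count = parse_listing_line(line)
        total += len(bases)
        if count is not None and count != total:
            return False
    return True


# The last two rows of SEQ ID NO: 53, starting from the preceding total of 16500.
lines = [
    "ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 16560",
    "TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16592",
]
print(check_running_counts(lines, start=16500))
```

A row whose printed total disagrees with the cumulative count points to a dropped or duplicated base group, or to a misread total, in the transcription.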
(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13794 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:


CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60

TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAGAnTTT 120

TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180

ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 240

TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 300

AGATTAAATA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 360

TTAATCAAAA TAAATAAAAT ACAATATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 420

AAGATTCTCA AACCAAGAAA ATTTTTCTTT TAAATTTAAA CAGATTTACC TCTTGATAAA 480

ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 540

[illegible] TATATTATTT CAATACCCTA CTATATATCA CAACACATAA ATTAAGCATG 600

ACACTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACTGAGTAT CATGCTTTTA 660

ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCTCGCATAC TGTCATCTTT 720

CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 780

GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 840

TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900

CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 960

TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 1020

ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080

TTCATAAACG TCAATCAACT GTTTCACTGC TGGCACTTCA GATTCAACAA TATCGTCACC 1140
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TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTGCG TTGAATACTG 1260 

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCATTTCTA ATTCTTTTTG 1320 

5 ACTATCAAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 13 80 

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGGTTT AT CTAAGATAGG 1440 

AAGCATTTCC TTTGGCATCG CTTTAGTTGC TGGTAAAAAT CTAGTCCCTA AACCAGCAGC 1500 

10 

GGGAATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560 

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TTCATAGTGT CATTGAGTAT 1620 

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGGCTTACTT CATATAATTT 1680 

15 

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTGCGT 1740 

AATAACGCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800 

2 0 TCTTCTCCAC CTGTTTCAGG TAGTTCAGAT TTCTTAGATT GTGCTTTTTT AGTTGGTACC 1860 

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 1920 

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTTACC AGGTTCAGTT 1980 

25 GGTACCTCTG GCGTTGGCGG TGTTGGTGTT TCCGGCTCGC TTGGTAGTTC TGGTGTGGGT 2040 

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCACGATT 2100 

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160 


AGTGTATCTT CTTCAAAGTC AACACTATTG . TGTCCACCGA ATTGATAATT TGGTTTATCT 22 20 

TTATTTGTAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 2280 

CTGTCGAAGT CGATATCAAT GATATTACGA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 2340 

35 

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 2400 

"AAATCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 24 60 

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520 


TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTACCGTG ACCATTTTCA 2580 

GTTCCTAAAC GAGAATGAGA AATATGATGA TTGTTTTCAG TAATTTCCTC GATTGGTCCT 2640 

45 TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700 

GTATATTCTT TCGTATCTTC AATTGTTGTA TGATCGCTAA CAGCACCAGT TACAATACCT 2760 

TTTGTAGAAT CTTCGTCAAA TTCAACTAGG TTAGACTCAG TAGTAACCTG ACCACGACCT 2820 

60 GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 2880 

GATTCTTCAA AGTCTACATG AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2940 
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, TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3060 

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120 

5 

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAGCATTTGA ATCATAATAC 3240 

CCTTCATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3300 

10 

CCACCATTAG TATCAAAATC TAAACTCATA TTATGAGTCA CATCTTCAAA TTTGCTGACA 3 360 

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 3420 

15 CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 3480 

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATGTACGCA AAATGACTAA ATTTCCCATC 3540 

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3600 

20 ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3660 

ATT TT GGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3720 

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3780 


TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 384 0 

ACGCAGTGTT GAGATACCAT GA G T TT CAAC ATTATCGCTT AATGTGAAAT CAAAAT AATC 3 900 

TCCCGCCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3 960 

30 

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTT CACTAC CTTCTTCTAC 4 020 

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 40 80 

35 TGCAACTGCT GTAACGTCAt TGatCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 414 0 

GGTTGAGCTA TGTCAACTTG AGTTC CTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 4200 

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4260 

40 GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4320 ' 

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4380 

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 4440 

45 TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4500 

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4 560 

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4620 

50 

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680 

" ATTATTTGAA ACTCACATAT ATATTGCATA CAAAGCTCTT GAACACCTTG ATATAACAGG 4740 

55 
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TACTAAACCA TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4 860 

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4 920 

5 

ATGATTATTT ATACAAAAAC AGCCGTATTT CAAGCGGACA TTTTAAATTT AACTAAATTT 4980 

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT ' 5040 

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100 

10 

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160 

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATTTTTGTT TAATTATGCT 5220 

15 TTGTGATTCT TTTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280 

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AGATTTCTTA 5340 

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 54 00 

20 GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGGCAGGT 5460 

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520 

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 5580 

25 GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 5640 

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700 

TTTGGAAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CAC CGAATTG ATAACTTGGT 5760 

30 

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820 
GGCACACTGT CGAAGTCGAT AT CAATG ATG TTACCGCCAT GTTCATACTT AGGTTTGTCT . .5880 

TTTTCTGTAT CTTCCTCGAA TGACTGATTA CCTTTATTTT G AC CATGAAT TTGAGGTACA 594 0 

35 

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000 

GTGTCTTCCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 6060 

40 TTAATATCAA CGTGG CTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120 

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180 

GGTCCTTGTG CTTGACCATG CTCTTCAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 6240 

45 tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300 

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTTAG ACTCAGTAGT AACCTGACCA 6360 

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 6420 

50 

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 6480 

ACGTGACCTG ctTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6540 

55 
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10 



15 



TGGTAATCAA TGTCAAGAGT TGATGAATCA TATT CCTCTT CAACAGTAGT TACTAAATTC 6660 

TTATCATATT GACCTGTAAG AGTTTCTTTA ATTGTATCTT CTTTATATTC AAATTTATTA 6720 

TTTTGAATAA TCGGACCATT TTTCTCATTT CCGTTOGCTT TATTACTGTA TAAAACTAAA 6780 

CCATTATCCC AAGTTAAGGT ATATCCTCTA TCATAATAAT ACTTATAAAG TTGCTCTGGA 6840 

TGTCCTACCA TTTGTGTTCT AAAATCAACT TCATCAGTAC CATTTAAATA CTCTCCATCA 6900 

TAGTGAACAA CATAAGTTTT ATCTAGATTT TCTATATTCA ATGAATAGCT TCCATTATTT 6960 

TGTAAATTCA AATTCCCACT CATATTACTT GTGACTTCTT TAAATTTAGA AGTATCTGTC 7020 

GTATTTGCAT ATACACTCTT CGCTATGTCT TCATTATTAC CCAAGTATTC AAATATCCTA 7080 

ACTTTTGGTT GATTTCCATT CTGATTACTA CCTTTCATTA AAGTTCCAGT AACAGTCACA 7140 

CTTGTCGTTT TACCATTATT AGGTTTAATA AATGCAACAT GCGAAAATCT ATTATTCGCT 7200 

20 TTATTAAATG TCTCAATCGA TCCATTTAAA TTGGCATAAT AATTCCCAAT ACCATCTTTA 7260 

TATTTAACAT CTAATTCCTT TGAAGTTTGT TCTTCATTTA GTGTTGAAGT TATAGTTTGA 7320 

TTTCCATTAG TTTGTACAGT TTTAGGATCA ATAAATAAAT TAATTTCTAG TTCAGCCGTT 7380 

25 ACATCAACCT TAT CTTCAAT ATCATTTGTA AATGTATATC TAATCTTTCC ACCTTCTAAA 7440 

ACTTCACCTG TCGCCATTAC GACTGAACCA TTTTTAATTT CTGGTACT.TT TCTAGCAGTT 7500 

GAT ACGC CAT GCGTATTTAC ATTATTTGAT AAAGTAAAGT CAAAGTAGTC ACCTTGATGT 7560 

AAACCATTCT CAAATTTCAA CTTATATTTT AGTACCGCTC GTTGTCCTGC ATGAGGTTCT 7620 

ACTTTATTTG TATTGTTATG CCCCTCAATA GAACCAATTT CTACTGTAAC TTTACTTGTT 7680 

ACATCTGTAC CCGTTTCCAC TTTCGCGTTA CTAGCTTCCT * TAGCTTCCGC TACATCTGCT 7740 

GATCTTGTCA CACGTGGCTT ACTTTCTGAT GCCGTTCTTG GCTGTGCCAC ' TTCAACTTGT 7800 

GTTTCTGCGA CTTGATTTTG TGTAGCCTTT TTAGGTGTTA AATCTACTTG TCTTTGATCT 7860 

40 ~ CCGCTATTGT CTTGAGATTG TGTTGTTTCC TTAACTTGAG GTTTCGCTTC TTCCTTAACT 7920 

ACCTCTTCTT TAACTGTTTC TATATTTGCT GGTTGTGCAG TTTGTGGTGC TTGTACTGCT 7980 

TTTGGTGCTT CTTCAGTTGT TACTTGTGTT GCGTTTGACG GTTGTTCTGT TACTGTTGCG 8040 

45 TTATATGATT GAGTTTCTTC TATATGATTA ACGTTAGTTG CAGTTGTTTG TGTTTCACTT 8100 

GTTTTATTAT CAGTAGCTGA ATTCCCATTT TCTTCTACTG TAGTTGTCTT TTGTTCTGAT 8160 

GCTGCAGCTT CTTTGTCTTG TCC CATC CCA ACAACGATCA TTGTTCCTAA GAATACTGAT 8220 

SO 

GCTGCTCCCA ATTTATGTTT TCTAATGCCG TACCTAAGAT TGTTTTTCAC TATAATATCT 8280 

CCCTTTAAAT GCAAAATTCA TTAATTTTTT AAACTTAATA AATGCAAGTC TATATTGTTC 8340 

ATGTTAATTG ATAATTTTAT TATTTGAAAT ATACCTATAA ATTGTATTCA AGTCATCAGA 8460

AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8520

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8580

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8640

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 8760

AAATGAAGAT GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 8820

TTTCTTTTAC AGTTAAACCA AAATATTGTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880

GAGACAAAAT CACACTACCT GCACCTATCG CAAGTACAAC TAATGCAACA TTTACATCTG 8940

ATGATTGTAA TAATGGTAAG ACAATACCTG TAGTTGAAAT GGCAGCTACT GTAGCCGAAC 9000

CTAATGCGAT ACGTAGCACA GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060

TACCTTCAAA CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120

ATGTACCGCC ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCGTCA 9180

CTGATTCCAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 9240

ATAATACTGC TATTAGCATG GCTGTCCCTG CTGTTCGTAT CATATAAATG ATAGATTCAA 9300

ATAGATTTGT AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 9420

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 9480

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9540

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600

CGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTGTTA CCATAGGTAG 9660

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 9720

TAAGACTAAA CCTAGTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 9780

TGCCCATTGT ACATGTTTTT GACCAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 9840

ACCACCACCA TCAGCAAGCA ATTTCCCAAG TATGGCACCT AAACCGAATA TCAGTGCAAT 9900

GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TGAATGGTAT 9960

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080

TATTTCGTTA AACATGACAT TGCCCTCTTT CTCTTTTCAA TAGAATGTAA CACGGTCGTC 10140
GAGTGACGTA TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260
TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320
TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380
ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440
CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500
TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560
TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620
GCAGCGCGAA TCATATGTTC TTTTTTATGA GATAAAGTTA AACCGAAGAA TGAACCTCTT 10680
GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740
TCTGCACCTG GTTTAACACG CTTTGCAATT TGAGTTAAGA CATCATAAGG ATCAACACCG 10800
AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACG 10860
ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAAGAA 10920
AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980
GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040
ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100
TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160
TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220
ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280
ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340
TTCATCCAAA AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCG TGTTCGCTGG 11400
TAAATCGCAT TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460
TCTGCCCAAG TAATATTATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520
TGCATTTGCG CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580
ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640
ACATCAACGT TTGGTGTGTG TAAATCATAG CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700
TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760
TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820
CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880
CTGCATCAAT AAACACTTGA TGATTATGAT GTATGCGTTC AAAATCTTGC GGGTTCTGTT 11940








AAAATGAGTT TAAATATTGA TGATTAGATG CTTTGATTAA TGTTTCATGA AATTCAAAGT 12060
CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120
GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180
ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240
ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 12300


PATfTCGAAT 


TGG CGAACG A 


CTCACATTAA 


ATTG CTTTG C 


CATTTGATTT 


TCAGTGAGTA 


12360 




& fYTT A PPTTT 


AGCTATGTGA 


CCATTCACAA 


TGCCTAAGCG 


TAATTCTGCC 


GCGATACCTT 


12420 


IS 




CATACCTTCC 

X ^^w^ X X>«V« 


AACCATTTCT 


CTGGATATCC 


ATACATCATC 


AAAGTCACTC 


124B0 




Li 1 Wii X X nwi 


fYZAPATAPTT 


GTATACAAGT 


ATGTTAATAT 


AGTTATTATG 


AGTTTGCAAG 


12540 




^^^P^P^P^^^P^P^P 


A fY3 Af5P APT A 


AAATAGTGAC 


CACCCCTTTT 


CGATTTAAAT 


TTAAAGGAAA 


12600 


20 


TQGTUiV- x A x 


f* Af* A P*fZ A ATfZ 


£^ X X X.TU^X, X \J X 


TATGTTGTAT 


GTGGGATATT 


TCTAATTGTT 


12660 






AiVjUvjul 1 in 


GGTACTTCAA 


TGCAATAATG 


CGTTTCATGA 


CAGTTTGGAC 


12720 




Al TCGAA I V-t» 


AUjiui iulL 


uLiViiniyi x 


TCGGT*TTGAT 

X WVr XXX X 


AACTGCCCAC 


AAAGATGGTG 


12780 


2S 


TV /"^ IV R *P Ji T Ji TV* 

AuAAiAlAlb 


XVjVjV_h^\Vj lift 


A PAT A A AT 


AGGCAACCTT 


TTGTTGGTAA 


TAAAAAGT AA 


12840 




LAL LAA 1 v»v_ 


& T IV 21 r* C* A ATP 


ATAAATGGTA 


AAGCAATTAA 


AAACGGCCAT 


ttatttttca 


12900 




iLAAAA 1 ITjL 


ArTTATAATTi 
X lninnivi 


PTAfiAATATT 

V* x nunn x x x 


GAATT ATT C C 


TATAATACCA 


GCACTAATCC 


12960 


30 


AfVMlTjx 1AL1^ 


APflAATAPTT 
ALunn X X X 


TTfATTT CAG 

X X V— XXX 


CTGATTTACT 


GATGACATGC 


TCTATGTCTT 


13020 




1 1 nnu iVjlvjl 


GATTGGAGAC 


GTCGAGG CTT 


CATTTACGTA 


ATATTGAACA 


TTTTTAATTT 


13080 


35 




rVlPTTGTTGC 

V^Vj^w i XVJi i V?^— 


TGTTTAACTT 


GTTGGTTAAT 


TTCTTGTTGT 


TTCATAGTTA 


13140 


\j X /-vr\rt\J X n X x 


GAGCGTCTTC 


AAAGTACCTT 


CAC CTTTT AG 


CAACATATCT 


ATATCGCTTA 


13200 




ACGCSCAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260
TATACAGACG ACGCTTTCCT TCTGTAAATC CTTGTGGTTT CAAAATACCT TTGCGATCAT 13320
AATATTGAAT CGTTCGTGTT GTCACATTGC ATAATTTTGC GAGTTCTCCA GTCGAATAGT 13380
TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGCAAGT 13440


45 


ACAATTTCCA 


CATTTTAAAG 


AAATTTATTA 


TAC7TAGGCGT 


CTTATTTTTA 


ITjAI 1 ILulA 


A. J 




CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13560
TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13620
TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TTTTATGTTC 13680
TAAACTTGcC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 13740
(2) INFORMATION FOR SEQ ID NO: 55:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1059 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:





GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60
TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120
TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180
CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240
TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300
ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360
CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420
TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480
ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540
AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600
TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCT AGTTGAGGTG GATACTGTGC 660
CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720
CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780
GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840
gaatSttttt GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900
TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960
TCTCTCCCTT TTTCTTCATC TTTnATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020
nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059


(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30246 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 60
ATTGTAGGTC CTGCATATCC ACAACAGGAT ATGTTAACTG AGTTAAATGG ATTTCGCGCA 120
TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180
ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTAT 240
GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGATTA 300
TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 360
GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 420
CAGGCAATGT GTGGCCTTTT TTACATTACA TCGTGGAAAA GGGGTCGCAC CATTTAGCGA 480
TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 540
AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TTCAGGATAT 600
TAATGAGATT GTGTTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 660
ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720
GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780
AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 840
GACCTAGTGA ACAATTGACA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900
GCGATTAATT GATAGACTCA TCATTTTTGC GCTGTCGAGA TGGTCTTTTT ATTAAAAATG 960
CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020
AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 1080
TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA AtCGTGTTCG 1140
CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAATAAT TCGTATTGCG 1200
TAACBTTgGA TTCACTTGAT GCAATTCATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260
TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320
GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 1380
ATGACCGAAG TAGACCGATA AATAAATGAG TGTAATCAAC AATATTGTTG TAACGATAgT 1440
GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT GAGTACGAAT AAATTACAAG 1500
CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTACGCCA AAAATAATCA 1560
ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATGAATC 1620
CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACAG 1680
CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 1740
CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG 1860
TGCACCGGAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 1920
GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 1980
TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 2040
TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2100
TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160
AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220
CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280
ATTGCTATGT TGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340
CGATAAAATC GATTGCAGTC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400
ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCGG 2460
CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGCGA AATGCCTCAG AATCTGCTTG 2520
CGATTTGACG TACTGATGAT TAATCGTCGT CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580
TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640
TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700
ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760
ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820
GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880
GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940
CATTAATTGC TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000
TGT^ATAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060
TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120
GAGATAcTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180
CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240
CTGAATTAAT GGTGGGTGTG TCGCCATCGT AATTGGATCG TCTGAAGGCG CATATAAATG 3300
ATAGTGCTCT TCGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360
AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420
TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480
TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540
TTCATTAAAT AATAATTTCC CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA 3660
TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC 3720
AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA 3780
TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT 3840
ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT 3900
ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT 3960
TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT 4020
CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT 4080
CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC 4140
GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT 4200
TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA 4260
TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT 4320
GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT 4380
GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC 4440
ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT 4500
CAGTGCGATT TAGAATTGTA TTTTACGAGG TTATCACGCA AAACAATTGT ATATAAAGGT 4560
TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA 4620
TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG 4680
GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA 4740
AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCGA GGATCAACAT 4800
AAAGTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA 4860
GAGTTCTTAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT 4920
TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT 4980
TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC 5040
GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTCGTTATA CGATTACTAA AGATAACTTT 5100
ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA 5160
GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA 5220
AATAATGATT TAAAAGGTGC GATTGCTGGA GAATTACCAT ATAAAGCGTG GATTGATAAC 5280
CATAAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG 5340
CAGGAACTTG TAGAAGGTAA GAAGGATCCT ATCGGTGCAA TGGGATATGA TGCGGCAATT 5460
GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520
GTTACGAATC CACCAATTGA TGCGTATCGT GAAAAAATCG TAACGAGTGA ACTTTCTTAT 5580
TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 5640
AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700
ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760
GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 5820
GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880
CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5940
GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 6000
GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TACAAGGCAC CGTTGTCGAT 6060
AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATG 6120
GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 6180
CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240
GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 6300
GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG CTTTTAATCC GGAATCTATT 6360
TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420
GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 6480
CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAACGCTT TAATACAGGG 6540
GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600
CAA'EI'AGGTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660
CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720
GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 6780
AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 6840
AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900
ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATG CGAATAAAGA TGCAGATATC 6960
GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGGA 7020
TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080
AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTGGTTTAG CAGAAACACA TCAAACATTA 7140
AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 7260
TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 7320
GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 7380
AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 7440
CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 7500
AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 7560
ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 7620
GAAGTAACGA AGCCATATAT TOCTGAAGGG CGTCGCTATA CAGGTAGCTT TACAGTAAAT 7680
AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 7740
GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 7800
GCAGCATATG CACCGAAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 7860
GGTAAAGGAT TATCTGGTGG TACGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 7920
GAAATTATTG CTGGTAACGT CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 7980
GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 8040
ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 8100
GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTGCTT ACGTTATCCC GTCTGATGTA 8160
GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 8220
GAAGAAAAAG CATTCATTAA GCAAATGCTG GAAGAACATG TGTCACACAC GAATAGTACG 8280
AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 8340
CCTAAAGATT ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 8400
GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 8460
AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 8520
TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 8580
AAGCATATGA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTGCA CGATGTATGG 8640
ATTGTGGAAC GCCGTTTTGT CAAACCGGAC AACAGTATGG TAGGGAAACA ATAGGTTGTC 8700
CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTG 8760
CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 8820
CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATCGATTGCG ATTAAAGGTA 8880
TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 8940
CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060
GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120
GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180
ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 9240
AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300
ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360
CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 9420
CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 9480
AAGCAATTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCCGGTG TTTAAAATGG 9540
ACTATGCGCA CCAAGAGTAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600
AAACAATGCG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 9660
TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTG 9720
ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCG AATGCTTTTA 9780
ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840
AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900
AAGAAGGTAG AGGCGTAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9960
AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTGAATC GAGTTTGAAA 10020
AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080
GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140
GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AACAATATTG ACTTTTACAG 10200
AACOTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260
CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCGGGGG TTCGAATCCC TCTTCCTCCG 10320
CCATCAATAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380
TTACCTTTTT TATTTGTGTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 10440
TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10500
GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560
ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10620
AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 10680
TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10740
TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860
CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10920
AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980
AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040
GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100
TAGAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTOA 11160
CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220
AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280
CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 11340
GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GTTTGATTTT AATGCATCCG CAATTAGTAT 11400
CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 11460
AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 11520
AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCGAT AAAAATGTTG GTCGTTGGAC 11580
CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640
TGaTTGGTAC AGGTATTACA TCTGGTGTTA GATTTATATT CCAACATGCA GGATGGCTTG 11700
GCGGAGCAAT ATATGGATTG TTATATGCAC CACTTGTAAT TACAGGACTA CACCATATGT 11760
TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820
TTGTTGCGAT TTCCAATATT TGTCAGGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 11880
AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940
TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000
CTGCGATATC AACGTCTTGT GTATTGGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12060
AAGTTGGTGT TGGTGGGGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 12120
ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180
ATTTTAGTAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240
TCGTTATTTG GACGTCCTTT ATTACGTTAT AAGGTGGTAA I rGTGTGTCG AAAGAAATAG 12300
ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 12360
ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420
TTGATTATAT TTGGTTAAGA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 12480
TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12540
CGACGGAGCA TGaATGGTTT AAAGAAGCCC GTAAATCTAA AGATAACCCy TATAGAGATT 12660
ATTACTTTTT CAGATCATCT GAAGACGGGC CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720
GTAATGGATG GAAGTATGAT TCTGAGACAG ATGAATATTA TTTACATTTA TTTGATGTCA 12780
GTCAAGCTGA TTTAAATTGG GATAATCCGG AAGTACGTCA ATCGTTATAT CGCATAGTCA 12840
ATCATTGGAT AGACTTCGGC GTTGATGGTT TTCGATTTGA TGTCATTAAC TTAATTTCTA 12900
AAGGTGAATT TAAGGACTCT GACAAAATAG GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960
TGCATGAGTT TCTGCATGAA TTAAATCGTC AAACGTTTGG TAACACTGAC ATGATGACTA 13020
TAGGAGAAAT GTCTTCGACG ACGATTGAAA ATTGTATTAA GTATACACAA CCAGAACGCC 13080
AAGAATTGAA TAGTGTTTTT AATTTTCATC ATCTAAAGGT TGATTATGTT GATGGTGAAA 13140
AGTGGACAAA TGCGAgcTTG nATTTTCATA AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200
GAGGTATTTA TGACGGTGGC GGATGGAACG CGATTTTCTG GTGTAATCAT GATCAGCCAC 13260
GGGTAGTGTC TAGATTTGGT GATGATACGT CGGAAGAGAT GAGGATACAA AGTGCTAAAA 13320
TGTTAGCTAT CGCACTGCAT ATGTTGCAAG GGACGCCATA TATTTACCAA GGTGAAGAAA 13380
TTGGTATGAC GGACCCACAT TTTACATCAA TAGCACAATA TCGTGATGTT GAATCGATTA 13440
ATGCCTACCA TCAGTTGTTA AGTGAAGGGC ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13500
GACAGAAGTC ACGAGACAAT TCGAGAACGC CTATGCAATG GAGTGATGAT GTTAATGCTG 13560
GATTTACAGC TGGTAAnCCT TGGATTGATA TTTCGGAAAA TTATCATCAG GTCAACGTTA 13620
GACAAGCACT TCAGAATAAA GAGTCTATTT TCTATACGTA TCAAAAATTA ATACAATTAA 13680
GACATACGCA TGATATTATT ACGTATGGAG ACATTGTGCC ACGTTTTATG GATCATGATC 13740
ATTTATTTGT TTATGAACGT CATTATAAGA ATCAACAATG GCTAGTAATT GCGAATTTCT 13800
CAGCTATCGGC TGTTGATTTG CCAGAAGGAT TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13860
CAGGCACAGT GGAAAATAAT ACGATAAGCG GGTTTGGTGC AATTGTAATC GAAACAAACG 13920
CGTAAAATAA ATTGAGTGGA TGCGTTTATA TGGCGAAACA AAAAAAGTTT ATGAAGATTT 13980
ATGAGGCGTT GAAAGAAGAT ATATTAAACG GGCAGATTCA ATATGGTGAA CAAATTCCGT 14040
CTGAACATGA TTTGGTGCAA TTGTACCAGT CATCTCGAGA GACCGTGCGT AAGGCATTAG 14100
ATTTGTTGGC ATTAGACGGC ATGATTCAAA AGATTCATGG TAAAGGGTCA CTTGTCATTT 14160
ATCAGGAGGT TACAGAGTTT CCATTTTCTG AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14220
AAATGGGCGT CGCATATTTA ACTGAAGTTG TTGTGAATGA GGTTGTTGAA GCGCATGAAG 14280
TTCCAGAAGT TCAACATGCT TTAAACATCA ATTCTAGTGA ATCACTCATT CATATTGTTA 14340
TTGTTTCAGA TATAGGTAAT GATGTTGCGA GTGATTCTAT TTATGATTAT TTGGAAAAGG 14460
TATTAAATCT TAATATTAGT TATTCAAGTA AGTCTATTAC TTTTGAACCG TTTGATGAAC 14520
AAGCATATCA ATTGTTTGGT GATGTATCGG TGGCTTATTC AGCAACAGTT CGAAGTATTG 14580
TGTATTTAGA AAATACAATG CCGTTTCAAT ATAATATTTC AAAACATCTT GCAAATGAAT 14640
TTAAATTTAA TGACTTCTCA AGACGTCGTA TAAAGTAAAC AATGATATAA ATGATTTATA 14700
CTTGCAATTA ACTATTAAAA TATAGTAATA TATATCTTGC CGTGCTAGGT GGGGAGGTAG 14760
CGGTTCCCTG TACTCGAAAT CCGCTTTATG CGAGGCTTAA TTCCTTTGTT GAGGCCGTAT 14820
TTTTGCGAAG TCTGCCCAAA GCACGTAGTG TTTGAAGATT TCGGTCCTAT GCAATATGAA 14880
CCCATGAACC ATGTCAGGTC CTGACGGAAG CAGCATTAAG TGGATCATCA TATGTGCCGT 14940
AGGgTAGCCG AGATTTAGCT AACGACTTTG GTTACGTTCG TGAATTACGT TCGATGCTTA 15000
GGTGCACGGT TTTTTATTTT TTAAATATTA AACCGATTAT TAAGAGTTGA AAATATATAA 15060
TTATAGAAGC TACTTTCTTG AAGACAATTC AGCGTATTAT ACGTGGAACA TGTTTGTGGG 15120
AAGTAGCTTT TTTATATGTG AAGTTTGATT CAAGTGAACT CGATGTGCAG TTTGAATGAT 15180
TTTTGTGTCA ATGAAAAGTA AGAAGTTATA ATTTGATGAT AAAGAAATGA TGGTGAAATG 15240
AGGGGGAGTA TCTTACAATA GAATTATTAA TGAGATACGT TATGATTATT GACAATCAAA 15300
TGCCTACGGA GGACATATGC AAATATATTT AAGTACTTTA ACAGAGTTAG ATTATGATAA 15360
ATCTTTAAAT AGTATTGAAG AAAGTTTTGA TGATAATCCT GAAACGAGTT GGCAAGCACG 15420
TGCGAAAGTA AAACATTTAA GAAAATCTCC TTGCTATAAT TTTGAATTAG AAGTAATAGC 15480
GAAAAATGAA AATAACGATG TCGTTGGACA CGTTTTATTA ATTGAAGTAG AAATTAATAG 15540
TGATGATAAG ACGTATTATG GTTTGGCGAT TGCCTCTTTA TCAGTTCATC CTGAATTACG 15600
TGGACAAAAA TTAGGTCGTG GCTTGGTTCA AGCAGTAGAA GAGCGTGCCA AAGCACAAGA 15660
GTATAGTACG GTTGTTGTAG ACCATTGTTT TGACTACTTT GAAAAGTTGG GTTATCAAAA 15720
TGGTGCTGAG CATGACATTA AATTAGAATC TGGTGATGGA CCGTTACTTG TAAAATATTT 15780
ATGGGATAAT TTGACGGATG CACCACACGG AATCGTAAAA TTTCCAGAAC ATTTTTATTA 15840
ATTGTTCAAT TAAGAAGTAA AGGTATTATC ATGCTATAAT GAGAGGTAAT TGTTTATGGA 15900
GGTGCTAACT TGAATTATCA AGCCTTATAT CGTATGTACA GACCCCAAAG TTTCGAGGAT 15960
GTCGTCGGAC AAGAACATGT CACGAAGACA TTGCGCAATG CGATTTCGAA AGAAAAACAG 16020
TCGCATGCTT ATATTTTTAG TGGTCCGAGA GGTACGGGGA AAACGAGTAT TGGGAAAGTG 16080
TTTGcTAAAG CAATCAACTG TCTAAATAGC ACTGATGGAG AACCTTGTAA TGAATGTCAT 16140
AATAATGGCG


TTGATGAAAT 


AAGAAATATT 


A A i'"' A A A A 

AtaAwACAAAw- 


1 iAAATATGC 


At- LAAb iuM 


16260 


TCGAAATATA 


AAGTTTATAT 


TATAGATGAG 


GTGCACATGC 


TAACAACAGG 


luLl X X inni 






AGACGTTAGA- AGAACCTCCA 


GCACACGCTA 


TTTTTATATT 


ljL»L_AAtA»A<JA 




G AAC CACATA 


AAATCCCTCC 


AAGAATCATT 


TCTAGGGCAC 


AACGTTTTGA 


TTTTAAAGCA 






ATCAAATTGT 


TGAACGTTTA 


AAATTTGTAG 


CAGATGCACA 


ALAAA1 IUAA, 


16500 


X \j J. onnun x >>j 


AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT 


G CGTGATGCA 


16560 


- X InnulAl XA 


TGGATCAGGC 


TATTGCATTT 


GGTGATGGTA 


CGTTAACATT 


GCAAGATGCG 


16620 


1 IvjAAIvj I (JA 


CAGGTAGCGT 


ACATGATGAA 


GCGTTGGATG 


ArTTflTTTflA 
ifcV* x x >j a x x vjn 


TGATATTGTA 


16680 


CAAGGTGACG 


TACAAGCATC 


TTTP A A A A A A 


TACPATPAGT 


TTATAAPAGA 


AGGTAAAGAA 


16740 


GTGAATCGCC 


TAATAAATGa 


TATYiATTTAT 


X X X X V-CI W\VJ 


ATAPGATTAT 

-TiXTWwVjxVX XnX 


GAATAAAACA 


16800 


TCTGaGaAaG 


ATACTGAGTA 

r\ X X VJ*Vw Irk 


X vwAVJvnU X Vj 


ATGAAPTTAfJ 
£\ X unnv X Inw 


AATTAGATAT 


GTTATATGAA 


16860 


ATGATTGATC 


x x a x i/v\i^n 


T A P ATT A P.TP 


t prt a 1 1 "i *pfT i w i * 

X vXrt J. X V_ X X 


TTAGTP.TP.a a 
X XAululvjtnn 


TCAAAACGTT 


16920 


CATTTTGAAG 


TflTTnTT A^TF 

X VJ X J. VJ X X X 


AAAATTAGPT 


GAGPAfSATTa 


agggtpaapp 


ACAAGTGATT 


16980 


GCGAATGTAG CTGAACCAGC APAAATTGPT TCATCGCCAA ACACAGATGT ATTGTTGCAA 17040
CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100
CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160
TCAATGCAAC AAATTGCAAA AGTGCTAGAT AAAGCGAATA AGGCAGATAT CaAATTGTTG 17220
AAAGaTCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280
AGTTTATTGC AAAATTCGGA ACCTGTGGCG GCAAGTGAAG ATCACGTACT TGTGAAATTT 17340


(JAVsCjAAwAQiA 


TCCATTGTGA 


AATCGTCAAT 


AAAGACGACG 


AGAAACGTAG 


TAGTATAGAA 


17400 


nwlui XwXAX 


GTAATATCGT 


TAATAAAAAC 


GTTAAAGTTG 


TTGGTGTACC 


ATCAGATGAA 


17460 




TTCGAACGGA 


ATATTTACAA 


AATCGTAAAA 


ACGAAGGCGA 


TGATATGCCA 


17520 


aa^paapaars 


CACAACAAAC 


AGATATTGCT 


CAAAAAGCAA 


AAGATCTTTT 


O/^y '»tii^ A A A A 

CGGTGAAGAA 


17580 




TGATAGATGA 


AGAGTGATAC 


ATGACAAGCG 


ATATAATCGT 


J4 T^/ AT^A A TV* 

ATC»TATAATG 


17640 


AAAGAAAGAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700
GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760
CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820
TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880
TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAGCGA 17940
TCCCTGGaAT GTGATCATAG ATGCATTATC CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060
TTATGAAATT GCCAGGCATT GGTCCAAAGA CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120
ATATGAAAGA AGACGATGTT GTTCAGTTTG CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180
TAACATATTG TAGCGTATGT GGTCACATTA CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240
ATAAGCAAAG AGATCGTTCA GTTATTTGTG TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300
TGGAAAAAAT GAGAGAATAC AAAGGTTTAT ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360
TGGATGGCAT TGGACCAGAA GATATTAATA TTCCTTCATT GATTGAACGC TTGAAAAACG 18420
ATGAAGTTAG CGAATTAATC TTAGCTATGA ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480
TGTATATTTC TAGATTAGTT AAGCCTATAG GTATCAAAGT GACGAGATTA GCACAAGGGT 18540
TATCGGTAGG TGGCGATTTA GAGTATGCTG ACGAAGTAAC ATTATCTAAA GCAATCGCAG 18600
GTAGAACAGA AATGTAATJtT CTTCTATTAA ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660
AAGTCACAGT GTAATCATTG TGGCTTTTTT TATGGTGTGG TGTGATGTAC TACTTTATTT 18720
GCGGTGTGGG GGTGGTATGG TTTACCTAGT TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780
CAAGCCGTTG GTTGTGATTT GTTACTTCTA ATAGTAATGA TGTGAATTGG ATTATCGAAT 18840
TAGATCTATG GTTATGGTGT GTTGGTGCTA TTAATTTGAT AAATGCGGTT AATGACTATG 18900
CAAATGAAAT TCTTTTGTAA TTGAAATGAT AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18960
GGTCTAAAGC TTATTAAATC AGCCTGTATA GCGGTGTTTT GAGAGATTAT TTAAAACTTG 19020
TAAATTTATT TTTAATTTCT GGTAAAAAAA TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080
ATATGGTTAG AGAAAAATCT GTTTCTTGTT CTAAAAAACG TACTATTTAT AAGTGGGGAT 19140
TTTTTAAGTT CGATTTTTAG GATAAGGGCG TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200
ACTGXTGTTA AGCAGTTTGA AAGCCTGTAT AGTATTTATT TGTTGAGGCA AACAAAACAA 19260
CTCAACTTAA GAAATAACTT GAATTACTAA CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320
AAATGTTAAT AAAATGTATA ATTAATTCTT GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380
AATGACAATA TGTCAACGTT AATTCCAAAA AACGTAACTA TAAGTTACAA ACATTATTTA 19440
GTATTTATGA GGTAATCAAA CATCATAATT TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500
GAACGCTGGC GGCGTGCCTA ATACATGCAA GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560
CTGATGTTAG CGGCGGACGG GTGAGTAACA CGTGGATAAC CTACCTATAA GACTGGGATA 19620
ACTTCGGGAA ACCGkAGCTA ATACCGGATA ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680
AGACGGTCTT GCTGTCACTT ATAGATGGAT CCGCGCTGCA TTAGCTAGTT GGTAAGGTAA 19740
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GAGACACGGT CCAGACTCCT ACGGGAGGCA GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860

gCtGaCGGAG CAACGCCGCG TGAGTGATGA AGGTCTTCGG ATCGTAAAAC TCTGTTATTA 19920

GGGAAGAACA TATGTGTAAG TAACTGTGCA CATGTTGACG GTACCTAATC AGAAAGGGAC 19980

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 20040

TGGGCGTAAA GCGCGCGTAG GCGGTTTTTT AAGTCTGATG TGAAAGCCCA CGGCTCAACC 20100

GTGGAGGGTC ATTGGAAACT GGAAAACTTG AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160

GTAGCGGTGA AATGCGCAGA GATATGGAGG AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220

TGTAACTGAC GCTGATGTGC GAAAgCGTGG GGATCAAACA GGATTAGATA CCCTGGTAGT 20280

CCACGCCGTA AACGATGAGT GCTAAGTGTT AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 20340

AACGCATTAA GCACTCCGCC TGGGGAGTAC GACCGCAAGt TGAAACTCAA AGGAATTGAC 20400

GGGGACCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA AGCAACGCGA AGAACCTTAC 20460

CAAATCTTGA CATCCTTTGA CAACTCTAGA GATAGAGCCT TCCCCTTCGG GGGACAAAGT 20520

GACAGGTGGT GCATGGTTGT CGTCAGCTCG TGTCGTGAGA TGTTGGGTTA AGTCCCGCAA 20580

CGAGCGCAAC CCTTAAGCTT AGTTGCCATC ATTAAGTTGG GCACTCTAAG TTGACTGCCG 20640

GTGACAAACC GGAGGAAGGT GGGGATGACG TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700

TACACACGTG CTACAATGGA CAATACAAAG GGGAGCGAAA CCGCGAGGTC AAGCAAATCC 20760

CATAAAGTTG TTCTCAGTTC GGATTGTAGT CTGCAACTCG ACTACATGAA GCTGGAATCG 20820

CTAGTAATCG TAGATCAGCA TGCTACGGTG AATACGTTCC CGGGTCTTGT ACACACCGCC 20880

CGTCACACCA CGAGAGTTTG TAACACCCGA AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 20940

CGTCGAAGGT GGGACAAATG ATTGGGGTGA AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000

GCGQCTGGAT CACCTCCTTT CTAAGGATAT ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060

ATAACGTGAC ATATTGTATT CAGTTTTGAA TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120

TAAAGTGATA TTGCTTATGA AAATAAAGCA GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180

TACATTGAAA ACTAGATAAG TAAGTAAAAT ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240

AAAGAGTTTT AAATAAGCTT GAATTCATAA GAAATAATCG CTAGTGTTCG AAAGAAGACT 21300

CACAAGATTA ATAACGCGTT TAAATCTTTT TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360

TGACTTATAA AAATGGTGGA AACATAGATT AAGTTATTAA GGGCGCACGG TGGATGCCTT 21420

GGCACTAGAA GCCGATGAAG GACGTTACTA ACGACGATAT GCTTTGGGGA GCTGTAAGTA 21480

AGCTTTGATC CAGAGATTTC CGAATGGGGA AACCCAGCAT GAGTTATGTC ATGTTATCGA 21540
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GAGGAAGAGA AAGAAAATTC GATTCCCTTA GTAGCGGCGA GCGAAACGGG AAGAGCCCAA 21660 

ACCAACAAGC TTGCTTGTTG GGGTTGTAGG ACACTCTATA CGGAGTTACA AAGGACGACA 21720 

TTAGACGAAT CATCTGGAAA GATGAATCAA AGAAGGTAAT AATCCTGTAG TCGAAAATGT 21780

TGTCTCTCTT GAGTGGATCC TGAGTACGAC GGAGCACGTG AAATTCCGTC GGAATCTGGG 21840 

AGGACCATCT CCTAAGGCTA AATACTCTCT AGTGACCGAT AGTGAACCAG TACCGTGAGG 21900 

GAAAGGTGAA AAGCACCCCG GAAGGGGAGT GAAATAGAAC CTGAAACCGT GTGCTTACAA 21960 

GTAGTCAGAG CCCGTTAATG GGTGATGGCG TGCCTTTTGT AGAATGAACC GGCGAGTTAC 22020 

GATTTGATGC AAGGTTAAGC AGTAAATGTG GAGCCGTAGC GAAAGCGAGT CTGAATAGGG 22080 

CGTTTAGTAT TTGGTCGTAG ACCCGAAACC AGGTGATCTA CCCTTGGTCA GGTTGAAGTT 22140 

CAGGTAACAC TGAATGGAGG ACCGAACCGA CTTACGTTGA AAAGTGAGCG GATGAACTGA 22200 

GGGTAGCGGA GAAATTCCAA TCGAACCTGG AGATAGCTGG TTCtCTCCGA AATAGCTTTA 22260 

GGGCTAGCCT CAAGTGATGA TTATTGGAGG TAGAGCACTG TTTGGACGAG GGGCCCCTCT 22320

CGGGTTACCG AATTCAGACA AACTCCGAAT GCCAATTAAT TTAACTTGGG AGTCAGAACA 22380

TGGGTGATAA GGTCCGTGTT CGAAAGGGAA ACAGCCCAGA CCACCAGCTA AGGTCCCAAA 22440

ATATATGTTA AGTGGAAAAG GATGTGGCGT TGCCCAGACA ACTAGGATGT TGGCTTAGAA 22500

GCAGCCATCA TTTAAAGAGT GCGTAATAGC TCACTAGTCG AGTGACACTG CGCCGAAAAT 22560

GTACCGGGGC TAAACATATT ACCGAAGCTG TGGATTGTCC TTTGGaCAAT GGtAGGAGAG 22620

CGTTCTAAGG GCGTTGAAGC ATGATCGTAA GGACATGTGG AGCGCTTAGA AGTGAGAATG 22680

CCGGTGTGAG TAGCGAAAGA CGGGTGAGAA TCCCGTCCAC CGATTGACTA AGGTTTCCAG 22740

AGGAAGGCTC GTCCGCTCTG GGTTAGTCGG GTCCTAAGCT GAGGCCGACA GcGTAGGCGA 22800

TGGATAACAG GTTGATATTC CTGTACCACC TATAATCGTT TTAATCGATG GGGGGACGCA 22860

tAGGATAGGC GAAgcGTGcG ATTGGATTGC ACGTCTAAGC AGTAAGGCTG AGTATTAGGC 22920

AAATCCGGTA CTCGTTAAGG CTGAGCTGTG ATGGGGAGAA GACATTGTGT CTTCGAGTCG 22980

TTGATTTCAC ACTGCCGAGA AAAGCCTCTA GATAGAAAAT AGGTGCCCGT ACCGCAAACC 23040

GACACAGGTA GTCAAGATGA GAATTCTAAG GTGAGCGAGC GAACTCTCGT TAAGGAACTC 23100

GGCAAAATGA CCCCGTAACT TCGGGAGAAG GGGTGCTCTT TAGGGTTAAC GCCCAGAAGA 23160

GCCGCAGTGA ATAGGCCCAA GCGACTGTTT ATCAAAAACA CAGGTCTCTG CTAAACCGTA 23220

AGGTGATGTA TagGGcTGAC GCCTGCCCGG TGCTGGAAGG TTAAGAGGAG TGGTTAGcTT 23280

CTGCGAAgCT ACGAATCGAA GCCCCAGTAA ACGGCGGCCG TAACTATAAC GGTCCTAAGG 23340
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TGTCTCAACG AGAGACTCGG TGAAATCATA GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460

AGGACGGAAA GACCCCGTGG AGCTTTACTG TAGCCTGATA TTGAAATTCG GCACAGCTTG 23520

TACAGGATAG GTAGGAGCCT TTGAAACGTG AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 23580

ATACTACCCT AGCTGTGTTG GCTTTCTAAC CCGCACCACT TATCGTGGTG GGAGACAGTG 23640

TCAGGCGGGC AGTTTGACTG GGGCGGTCGC CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 23700

TTCCCTCAGA ATGGTTGGAA ATCATTCATA GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760

GAGACCTACA AGTCGAGCAG GGTCGAAAGA CGGACTTAGT GATCCGGTGG TTCCGCATGG 23820

AAGGGCCATC GCTCAACGGA TAAAAGCTAC CCCGGGGATA ACAGGCTTAT GTCCCCCAAG 23880

AGTTCACATC GACGGGGAGG TTTGGCACCT CGATGTCGGC TCATGGCATC CTGGGGCTGT 23940

AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000

CGTCGTGAGA CAGTTCGGTC CCTATCCGTC GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24060

CTTAGTACGA GAGGACCGGG ATGGACATAC CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120

ATAGCTGGGT AGCTATGTGT GGACGGGATA AGTGCTGAAA GCATCTAAGC ATGAAGCCGC 24180

CCTCAAGATG AGATTTCCCA ACTTCGGTTA TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240

GGTTCGAGGT GGAAGCATGG TGACATGTGG AGCTGACGAA TACTAATCGA TCGAAGACTT 24300

AATCAAAATA AATGTTTTGC GAAGCAAAAT CACTTTTACT TACTATCTAG TTTTGAATGT 24360

ATAAATTAGA TTCATATGTC TGGTGACTAT AGCAAGGAGG TCACACCTGT TCCCATGCCG 24420

AACACAGAAG TTAAGCTCCT TAGCGTCGAT GGTAGTcGAA CTTACGTTGC GCTAGAGTAG 24480

AACGTTGCCA GGCAAAAAAT GGATGCGATG AGCCGCATTG AGAGCGCAAG GTCTCTTTTT 24540

TTTATGTCTA AAACGTCAAA ATAAAAAGCA AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600

AAACGTTTGA ATCTGACGAA ACGAGAAAAG ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24660

TAAGyGAGAG CCGAAGrAGA GGAAAGAAGC AAGCGATTGT CACAAGTCAA GAAAGGTTCT 24720

TAGCGASGAT GGTAGCCAAC TTACGTTCCG CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780

AATGTACACT TTGGATTGTC TAAGTATGTA CAACTTTAAT TTTGTGTTTA TATAAATTTA 24840

AAATGATATC ATCGAAAACA AAATATTGTA TAAATAGAGA AGAGCAGTAA GACGGTATCT 24900

AATTGAAAAT GATCTTACTG CTCTTTTATA TACTTTATTG AAATACAAAA AGGAAATTAA 24960

TTATTATACA ATAGACAAGC TATTGCATAA GTAACACTAA CTTTTATCAA AGAAGTGTTA 25020

CTTTATAATT AATGATTTTA TTAGAGCGTC TACATGCGGT TTTAAAGCAT CATCGTCTAT 25080

ACCGCCAAAG CCTAATATAA ATTTAGGGGT TTTCTTATAG TCTTGATCAT CATCAAAATT 25140
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TCCATTTTTT ACTGTAATTG TAAAATGCAT 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ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260

CTCTTTGTAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 25320

CAtTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 25440

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 25500

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560

GCCGAAATAT CTAAACTCGG AATCATAATG ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620

TTGAGCCCAT TGTATTAATT GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680

TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 25740

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860

TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920

TACGGCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26040

TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 26100

ATAAGGTTCA TCACTCGCTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCGGACACA AAATATCCAG AGCGAGGTCT 26220

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATGCA TGCTCTACGG TTGTTTGGCT 26280

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 26340

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400

TTTT&TAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 26460

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 26520

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580

CGTTAAfGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 26640

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 26760

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 26880

TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940
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ACAAGTTAAT TCAGAAGTTA GTCGATTGAC TGTAATGAAT GATGATGAGA TTATGACTTT 27060

TGCGAAAGAT ATCGGTGCGC CTTATGAAAT TTTAAAACAA ATTAAAGACA ATGGTCGTTT 27120

ACCGGTAGTT AACTTTGCAG CTGGTGGCGT TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180

GGAATTAGGT GCTGACGGTG TATTCGTTGG ATCAGGTATT TTTAAATCAG AAGATCCAGA 27240

AAAATTTGCT AAAGCAATTG TTCAAGCAAC AACACATTAC CAAGACTATG AACTAATTGG 27300

AAGATTAGCA AGTGAACTTG GCACTGCTAT GAAAGGTTTA GATATCAATC AATTATCATT 27360

AGAAGAACGT ATGCAAGAGC GTGGTTGGTA AGATATGAAA ATAGGTGTAT TAGCATTACA 27420

AGGTGCAGTA CGTGAACATA TTAGACATAT TGAATTAAGT GGTCATGAAG GTATTGCAGT 27480

TAAAAAAGTT GAACAATTAG AAGAAATCGA GGGCTTAATA TTACCTGGTG GGGAGTCTAC 27540

AACGTTACGT CGATTAATGA ATTTATATGG ATTTAAAGAG GCTTTACAAA ATTCAACTTT 27600

ACCTATGTTT GGTACATGCG CAGGATTAAT AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660

AGGATACCTT AACAAGTTGA ATATTACTGT ACAACGAAAC TCATTCGGTA GACAAGTTGA 27720

CAGCTTTGAA ACAGAATTAG ATATTAAAGG TATCGCTACA GATATTGAAG GTGTCTTTAT 27780

AAGAGCCCCA CATATTGAAA AAGTAGGTCA AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840

GAAAATTGTA GCTGTTCAGC AAGGTAAATA TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900

AGATGACTAT AGAGTAACTG ATTACTTTAT TAATCATATT GTAAAaAAAG CATAGCTTAA 27960

TGTATGCTAA ATCAACGAAT TATTGATATT TATAGATTTG TTGAGAAGAA AATATCTCCT 28020

TCAAACTTAG CTTTGGAGGA GTTATTTTTT ATGTCAAAAT TAAAAATGAT AAAAAATAAA 28080

GCTATACATA AGAAAAAAAC CCTTCAAAGA GACTGAGAAT AGTCAAAATT TTGAAGGGGT 28140

TAATTCGATG TTGATGTATT TGTTAAATAA AGAATCcAGC GATTGCAGCT GAAATGAAAG 28200

ATACrAGTGT tGCACCGAAT AATAATTTCA AACCAAAGCG GGCAACTGTA TCTCCTTTTT 28260

TGTCATTAAG TGATTTAATC GCACCTGAAA TAATACCGAT AGAGCTAAAG TTAGCAAATG 28320

ATACTAAGAA TACAGATGTA ACACCTTTTG CGTGTTCAGA TAAATCACTA AGTTTACCAA 28380

GTGCTTGCAT TGCTACAAAT TCGTTAGATA ATAGTTTTGT CGCCATAACT GAACCGGCTT 28440

GAACTGCATC TTGCCATGGC ACACCGACTA AGAATGCAAA TGGTGCAAAG ACAAAACCAA 28500

TTAATGTTTG GAAATCCCAA GAAATAGCGC CACCTGAAAC TGTACTAAAG ATATTGCTTA 28560

CAATTCCATT TAATAGAGCG ATAATGGCAA TGTATCCGAT TAACATTGCG GCTACAATGA 28620

CAGCTACTTT AAATCCATCT AAAATATATT CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 28680

TTTCTTCAGT TTCTTCAACT AATAATTTGT CATCTTCTTC ATTAACTTTA TAAGGGTTAA 28740
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



TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTGAAGC AGAAACAGTC GACATTGCTG 28860 

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980 

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040 

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100 

AGAAGAATGG TGGTTGCTTA GGATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160 

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCGTTTGA AATACCACCA ATAACCTTGA 29220 

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280 

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340 

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400 

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460

AAAACAAATT AATTGAAAAA GGTCAAAATA GGTCATATAA TATAGTCAAA GAAGGTCAAA 29520 

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580 

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640 

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700 

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760 

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAGC 29880

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000

ATGAGGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180

AGAAGGATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240
TCAAGA 30246
(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14333 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 





TATTCCCCCA TCGGTTTATT AAATCGTCCA TTTCAATACT GTTTTTCCCC AAGATGTCGA 60

TAAATCCATT TCAAACGCTT GGACGATATC TTGCATCGTA CATACATTAA TTTCATGTCC 120

TTTTAATAAT GCTAACTTTT CAACTATGTC TGGGTACTTA CGATATAAAT CAACAACTTG 180

CTCAAAATCT TTAGAGCCGC TTCGACTACT ACCAATCAAC GTTAATCCTT TTTCAAGTAC 240


TAATCGTGTA 


TTCACTTCCA 


CGGGTAATTC 




AACAAAGCAA 


TACTGCCTTC 


300 




TGGTGAAATA 


TGTTCAACTA 






V_ X X XXX 






75 




TGATCAATTT 






X X XnwluXnn 


Af3 ATmr* att* 


A *3 ft 




TACAAATGAA 


AAATRACTTA 


ATTTATAflTV 




a 2VT2i PAT! 7A/"2 


' ■ " i *■ i " 1 T ap.r " i " 1 'f~* 
1X1 liiVlLi 11. 


A Q ft 




TGGGTACAAC 


TTACGTAGCA 


AAATAGCAGT 






/~»a «p/ip » n » qi 




20 


ACCAAAGCTG 


GTTTTCAAAG 


GTATAGATTT 


ACGTTCAAAT 




CATGATAALi 


" 600 




TACTGACACT 


AACTCTGTGT 


ATGAAATCGT 




A x\» 1 t-ATTAG 


PP»1\PP»P*P HAP 


' 660 




GATACGATCA 


TGTGCCATCA 


CAACGTAGTC 






PBPTRPRTPT 

CACTAGATCT 


720 


25 


AAAATAACTA 


GAGGCTAAGT 


AATTCTCCGC 


AA 1 Aft i A IVxA 


TGTTGCTCTG 


TAGGTGTATT 


780 




CGGTACCATT ACTACTTTCG TACCTTTTTC 


AAA 1 At- \- X 


TTACTATCAA 


ATACAACTTC 


840 


30 


ACCAACAGCT 


TCATGAACTA 


ATGACATTGG 


Inhl 1111 




TTTCATCTCT 


OPj Pi 


TCGACCTGTG 


TAATACCTTT 


GATCAGCTGC 


& P 21 A O. T* 2k f^l & C 


7i Afz*r 7i *r 7i 7i 71 r* 

r*-rMj 1 A 1 Hnftlj 


V» 1 1- x lA^uAl 


Q £T PJ 




GACATGATTA 


CCATAAATAT 


CAACATTATT 






Tf^nnT^cc a a r* 

X ^— VjVj 1 LjV_*AAI~ 


'i n "> pj 


35 


GAGTTGATAT 


ACTTGATTAA 


TCATCGGCAA 


TATCACCTTG 

X«%X WAvw X X \7 


AATAATHTirA 


111 VjV— X /^V— X 1 






TTAAATCATA 


CGGTGTTGTC 


ACTTTAATGT 


TGTATAGTTC 




AATTTAAi'^TYZ 
xm xxx iinv. x v7 






CAT^TCCAGA TTCGACAATG 


ATTTTACATG 


CATCTGATAA 


GATTTCTTTT 


TflTTP ACT A C 


1200 


TTAAGGCGCG ATAACTATCT TGTAATAATT TAATATTAAA TGATTGTGGT GTTTGGCCTT 1260

GATACATTTC ATTCCTTACA GGGATACTGT GTATGTTCTG TTTATCTTTA GACATTACAA 1320

TCGTATCAAT TGCTTCAATG ACTGTATCTA CTGCACCATA TTTTGCTGCT ACTTCAATGT 1380

TCTCTTTAAT AATACGTTGA GTTAAAAATG GTCTTACGGC ATCATGAGTT ACAATCACAT 1440

CATCATTATT AATTCCATTT ACATTGCGAA TATGGTCGAT AATGTTCATA ATTGTTTCAT 1500

TTCGATCCGT ACCACCTGCA ACTACTTTGA CACGTTGATC TGTAATGTTA TATTTTTTTA 1560

AAATATCCTG TGTATGGGAA ATCCACTGTG CTGGCGTTGC GATAATAATC TCATTAAATT 1620

CACTCACTAA AATGAACTTC TCAATTGTAT GGATTAAAAT CGGTTTATTA TCAATATCTA 1680






EP0 786 519 A2 



CTGCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTACTG 1800 

ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA 1860

TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT 1920

AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT 1980

AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA 2040

ATTTATCTTA TTAAGTGGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT 2100

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160

ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 2220

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGCGTT TTCGTCCTCA AAGGTGCATA 2280

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTA^CAAA 2340

TGTCTCCGGG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 2400

TTTCGTCTTC ATATAATGTA AGGTTGCCGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 2460

ACAGTTCCAA GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 2520

ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTCGCCAC CTTTTTTAGG 2580

TAGGGTCAAT ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 2640

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700

TTGATGTGCG CCCAATGATG TTGCZAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 2760

GATATGTGCA GCACGAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880

AAGGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2940

AACATTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060

AGGTATTGCT GGGCAACCTG TCACTTTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180

CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCGCTCGTA CTTTAATAAT 3240

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3300

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360

TTTTTCTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACG 3420

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGCAT AAAAGTTAGC 3480
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TGCTTTAATA CCTTCGCCGG ATTTTAAATG TTGATACGCC TCGTCCCATT TCGAAATATC 3600

ATATATTTTT GTCACCAAAG CTTCAGCATT TACTAAACCA TCCGCCATAA GTTGCAATGA 3660

AGGTTCCCAA TCTGCTGGCT TTTGACTTCT ACTACCAACA ACTGTTATTT CTTTTTGAAT 3720

CACTTTTTCC ATATCAAATG GAATTTCAGC ATCCTTAAAA ATACCTATTT GACTGTAGAA 3780

ACCTTTTTTG CGTAAAATAT CCAAACCTTG TCGTGCTGCT GGAACTGCAC CTGAACATTC 3840

AACAACAACA TCTGCACCGT AACCGTCTGT AATTCCATTG ATATACGTTT TTAAGTCTGT 3900

TTGTTGTAAA TTGACTACAT AATCCATGTG CAATGCTTCT GCTTTATCTA ATCTGACTTT 3960

GTCATTGTCC AATCCAGTTA CCACAACAGT TGCGCCTTTA CTTTTTAACA CTTGTGCTAC 4020

AAGTAATCCG ATTGGCCCAG GTCCCATTAC AACTGCTACA TCGCCTGAAT TGACTTGAAT 4080

CTTAGAAACG CCATGATGTG CACATGCTAA TGGTTCTGTC ATAGCTGCAG ACTGATACGA 4140

TAtTCGTCTG GAATATGATG CAAACTTTCT TCACGTGCAA TGACATAATT AGTAAATGCG 4200




CCATCAACTT 


GTGTTCCAAT 


ACCTTTTCGA 


TGGTTGCATA 


AATTATAGTC 


TTTTfJ A' F'l TA 


Tt ^ D L/ 




CAGTATTCAC 


ACTCATTACA 


AACATAGAAT 


GTCGTTTCAG 


aTG t GACACG 


GTCACCAACT 


** j •£ y 


25 


TTAAAATCTT 


TAACGTCTGC 


TCCAACTTCA 


• ACGATTTCAC 


CAGAAAATTC 


f \ x vjrnv>* >— l nn X 


43 80 




GTCACTGGAA 


AATTAACTTT 


ATAATGACCT 


TCATAAGTAT 


GAATATCTGT 




A 4 a n 




CCTGCATAAT 


GTACTTTAAT 


CTTTACTTTA 


TCATCTAGCG 


GTGTTGCAAC 


x x V— xxx .nx v^^rt. 




AGAAGTTCTA AGTTGCCATG TCCTTCTCTT GTTTTTACTA AAGCTTTCAC CACAAACACC 4560

TCGATTTTTA ATTGAATAGA CTAAATAGTT TAAAGATAAG ATAGTTAACG ATATTACCAC 4620

CTTGATCAAT ACTTGAAATT TCAGATGAAC CTTTTGGCAT TTGTACATTC GTACCTTTCG 4680

CCATATCTGT GAAAATGGGT GCTACGTCTG TTGCAATATA TAGTGAAATT GCAATCATAA 4740

TCGTACCCAC AATGACAGAA TGAATAATGT TTCCTCTTGC TGCACCAACA ATAAACGCGA 4800

CAACAAATGG TATCGTTGCT AAGTCACCAA AAGGTAGTAC TTGGTTTCCT GGTAAAATAA 4860

CGGCTAATAA AACAGTGATA GGTACTAAAA TTAATGCTGT CGAAATAACT GCTGGATGAC 4920

CTAATGCTAC AGCCGCATCC AATCCAATAT AAATTTCACG TTCGCCAAAA CGTTTATTTA 4980


45 




TGCAGACTCT 


GAAACTGGCA 


TTAAACCTTC 


CATTAAGATT 


TTTACCATTC 


5040 




TAGGCATTAA TACCATTACT GCAGCCATTG ACATTCCTAA ATTAATGATG TCTCCAGGTT 5100

TGTAACCTGC TAACACACCA ATACCTAAAC CTAAAATTAA GCCGACAAAT ATAGACTCTC 5160

CAAATGCGCC AAAACGTTTT TGAATTGTTT CAGGATCAGC ATCTAACTTA TTCAGACCGG 5220

GTACTTTTTG TAACAATTTA ACTAAGTAAA TACCTGGTGC ATAAGAAATT GTACTTCCTG 5280
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CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 5400

CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460

CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 5520

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 5580

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 5640

AATTTTTAAG TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 5700

AACCAGACCT AAATGCGGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760

CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 5820

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CGATACTTTC CGATTGTGAT 5880

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCGTTGCT AAATTCGACT GCAGTCCTTT 5940

TGAATTACAG tCACTGCGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6000

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 6120

CGTGCGTTGA TAACTGGGAA TTTATATTCT TTTTTTGTCA TTGCAGTTGT AACTAATAAA 6180

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 6240

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GGATTATTTA CTACTGTTGA CGTTGCAATA 6300

CCTGCACCAC AGGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 6360

ATATTCCAAA TAATCATTGA TTAGTGTTGC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 6420

CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 6480

ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540

TGTTrCCATt TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6600

GTTAATATGT TCGACATCTG TATGCGGTAT AGCGACCGAA GATAGATGCG TTGGTAAACC 6660

AGTAGCAAAT TCTTTTTCTG TGTCGATGAC TGGATCTTTA AACGTTGACT TCACGAACCC 6720

ATTTTGAAAT AACACATCTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGGGGACAA 6780

ATTGAGCATT ATATTTTCTT TATGGACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 6840

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6900

CGCTTAATTG GGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6960

GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 7020

TATATTGGTT CTTGTAATAT CAACATACTC ATCGCTGTTT TATGTACATG CTTTTCAGAG 7080
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TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 7200

TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7260

ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGtGACaA TCGATCCATT AATGGTTGAA 7320

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7380

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCTTATCA ATGTCAGCCG 7440

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCGTTGCTCA TTTTCAGATA 7500

AAGCTACTTT TGTGATAAAT AATTTTTTAT GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560

TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620

TTTCTGGAAA TAAGTTTCGC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTGTG 7680

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 7740

CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG GAACGATTTT CCGAAATATT 7800

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AATCCTATAA AAAGCAGGCG 7920

TTAAATGTAA CZAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980

TTTGTTGAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100

TTTCCGCTTC TGGCAAATCG GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160

TCGTATCGCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 8220

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280

GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340

ATGtTAAGTT GAGCGCCTTT AAAACTTCAC TTTTAGTAAT ATCATGATTC AACCTTTGAT 8400

CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACAATT TCATAACCAT 8460

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8520

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8580

TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 8700

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 8760

CTTGTTCAAG CCACAAATTG ATTTTTTGAA TGCGATATCC TAGTTGTCTA CGAGACAAAC 8820

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TCTCTGAGTA 8880
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



TCAATCGTCA CACCGATGTA CACACTTTGA ACACATATTT TCAAAATGAG CATGTACATC 9000 
ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060 
AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATTT 9120 
TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180 
CCCTAAGGAC GCTTATATCA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 9240 
TTATTCGTAT GTACGTAACT TATGGTCTAT CAAGTTCCAC ACTTCTTCAA CATCAACTGC 9300 
TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TtTAAATCGA GAGCCATAGT 9360 
GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420 
TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 9480

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 9540 
AGGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 9600 
TACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCTGCTTC 9660 
AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720 
CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 9780 
TACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 9840 
GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA CGTGATGGTG TCGACGGTAG 9900 
TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9960 

TAACGGCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 10020 

GTTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10080 

TAATATCTCT AGTGGACTGT CAATTGCCCC CGGTAAAATT AATGCTATTG CATCATCGTG 10140

TCCTDGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260 

TTTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320 

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380

GACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440 

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500 

TCAAAAGCTA TATTGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560 

ACAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620 

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680 
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AAATTTACCG 


GTCATCTTTG 


CAATTGGTGT 


CGCAATCGGA 


T*r ZiTPT Zt n A a 




lvoUU 




TACTGCAGGT 


tTAGctGCGC 


TGCTCGGTTT 


CTTAATTATG 


/U-W,VjV^/\/\V. XA 




iUooU 


5 


ATTAACTATC ACGGGCACAT TGGCAAAAGA TCAGCTTGCA 


r* zv zt zl & Tfzn & 


A A/-^/-«/-» A <T , ^2/~ , T , 
AnuuLn X X 






GCTCGGTATA 


CAAACGGTTG 


AAACCGGTGT 


TTTTGGCGGG 


nil r\ 1 ^—/^V r\\j 


X J\ X X A 1 unL 




10 


CGCAATACTT 


CACAACAAAT 


ATCACAAAGT 


GGTATTACCA 


r' A * I " J f LJ T ^ A 

LLulnl X IAVj 


GTTTCTTTGG 


1104 0 


TGGCTCTAGA 


TTTGTCCCTA 


TTGTCACAGC 


ATTTGCCGCA 


ATCXTTTTAG 


prrv^Tn TTP J\ T 

\j 1 vj I A I rijAl 


11100 




GTTTTTCATT 


TGGCCAAGCA 


TACAAGCCGG 


CATTTATCAT 


GTTGGTGGAT 


TTGTAACGAA 


11160 


15 


AACAGGTGCC 


ATCGGTACTT 


TTGTTTATGG 


CTTCATCTTA 


A^tAMWiy "1"!' A r* 

AGATTGTTAG 


GTCCACTCGG 


11220 




TTTACACCAT 


A'xTUTTl'ACT 


TACCGTTTTG 


GCAGACGGCA 


CTTGGTGGTA 


CTTTAGAAGT 


11280 




CAAAGGGCAC TTAGTTCAAG GTACGCAGAA CATCTTCTTT GCTGAACTTG GTGATCCAGA 11340

TGTGACGAAG TATTATTCAG GTGTGTCACG CTTTATGTCA GGCCGTTTTA TTACGATGAT 11400

GTTCGGCTTA TGTGGTGCCG CACTTGCAAT TTATCACACA GCTAAACCTG AACATAAAAA 11460

AGTTGTCGGC GGTTTAATGT TATCCGCTGC ACTCACTTCA TTTITAACAG GTATTACCGA 11520

ACCTTTAGAG TTTAGTTTCT TGTTTGTCGC ACCTATTCTT TATGTAATCC ATGCCTTCTT 11580

TGATGGATTA GCATTTATGA TGGCAGACAT TTTCAACATT ACAATTGGTC AAACCTTCAG 11640

TGGAGGCTTT ATCGATTTCT TACTCTTTGG TGTGCTACAA GGTAATAGTA AAACAAACTA 11700

CCTATACGTC ATACCTATTG GAATTGTGTG GTTCTGTTTG TATTACATCG TTTTCAGATT 11760

CTTAATTACG AAATTTAATT TCAAAACACC TGGTCGAGAA GATAAAGCTG CAGCACAACA 11820


35 


AGTTGAGGCT 


ACTGAAAGAG 


CACAAACTAT TGTTGCTGGT 


TTGGGAGGCA 


ft ft: ft m % » n 

AAGATAACAT . 


11880 


TGAAATCGTT 


GACTGTTGTG 


CAACGAGACT 


ACGCGTCACA 


CTTCATCAAA • 


A ft V * A AAA 

ATGACAAAGT 


11940 




CGATAAAGTA 


TTACTCGAAA 


GTACTGGTGC 


CAAAGGTGTA 




GCAC7TGGTGT . 


12000 


40 


RrAftflTAATT 

Ov-rirtO X ■t\r\. X X 




ACGTTACAGT 


TATCAAAAAT 


VjAAAX IbAAb 


AATTGCTCGG 


12060 




GGATTAAGAC 


TAACCGAAAT 


ATCAACAGAA 


CTAATGGCAA 


f"Y2 & TYtT* 2k Of! 21 
^-oA 1 vj 1 ,rt\»kjA 


A^t X AAts AA*j I 


1212 0 




GACATCGTTG 


CTTTTATTTT 


TAATGTTACA 


TTTGAAGCAT 




AXvjI—HA_ HjI A 


1* lo U 


45 


GTGAGCCCGC AAATCGCCTC TGCTAGACAA TCATCTTAAT GCTATGATTA AAGCTTAAGT 12240

GCCAGATTTG AATTTAATTT CAACAACGAC TTTCACTACA TTAAAAATAG GGCCACTCGA 12300

CACATATAGT TGTATCAAAT AGCCCTTTAT ACAATTTTTT GGGTAAGGTT TTACAATTTT 12360

TGGGATGGTA TAGATTTTAT AAAAAGTTAT TTAAGTTCTT CTGCTTCAGC CATAATATCT 12420

TTTAATGTTT TAGCTGAATG TGCGAACTTG CTTTGTTCTT CGTCGTTTAA TGGGATTTCT 12480






TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600 

ATCGCTTCAG TAATTCTAGC TAATGCCATT GCAACACCAT AATAAGTGGC ACCTTTAGCT 12660

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTTG 12720 

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTTGAC CCGCAATATT AGCGTGTGAC 12780 

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTAGGTGGC 12840 

GCAACATCGn AcgyTcGCTT AACAATAATC TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900 

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 12960 

CAACAGGATT TGTAGCTACC AAGAAAATAC CATCAAATTT TGATGCCATT ACTTCACCAA 13020 

CAATTGATTT GAATATTTTC AAGTTTTTAG ATACTAAATC TAAACGTGTT TCTCCAGGTT 13080 

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140 

CAGCTTTCAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200 

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260 

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCAGCATT ACCTATTAAT ACAACTTTGT 13320 

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380 

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 13500

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13620

TTTAGTCTTC GTACTTCGCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13680

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG CTCAATCGTC 13740

GGTGTGCAAA CAGACAATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13800

GTCATACAAT TATGAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860 

TCGAAATTTT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13920 

GAAGAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980 

ATTACGTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040 

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100 

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160 

AAACACTGGG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14220 

AAAGCCATTG AAGACGTGAC AGGATTAGAA GAAAATGAGC CTGtCATTCA AGCTTGGGCA 14280 






(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8779 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

GGTATTTTnG GAnGGGTACC TAAAGCAATT CCGGCAAAGG GTnAATCCAG GTACCGAAAT 60 

GGACTTCCCG TTATCGATAA TACCGACATA TATTGTGACA AGTAGATTTT ATGGACATTT 120

AGGCTTACTT TTACTTGTGA TAATTGCATG TATGTTTACT GGTATTTAtC CaTCaATACA 180

TATCATTCAA TTATTGATAT ATGTACCGTT TTGTTTTTTC TTAACTGCCt CGGTGACGTT 240

ATTAACATCA ACACTCGGTG TGTTAGTTAG AGATACACAA ATGTTAATGC AAGCAATATT 300

AAGAATATTA TTTTACTTTT CACCAATTTT GTGGCTACCA AAGAACCATG GTATCAGTGG 360

TTTAATTCAT GAAATGATGA AATATAATCC AGTTTACTTT ATTGCTGAAT CATACCGTGC 420

AGCAATTTTA TATCACGAAT GGTATTTCAT GGATCATTGG AAATTAATGT TATACAATTT 480

CGGTATTGTT GCCATTTTCT TTGCAATTGG TGCGTACTTA CACATGAAAT ATAGAGATCA 540

ATTTGCAGAC TTCTTGTAAT ATATTTATAT GACGAAACCC CGCTAACCAT TAATAAATGG 600

AAGTGGGGTT CATTTTTGTT TATAATTTAA GTAAATAACA TATTAAGTTG GTGTATTATG 660

AACGTTTTAA TAAAGAAATT TTATCATTTG GTAGTTCGAA TACTTTCTAA AATGATTACG 720

CCTCAAGTGA TTGATAAACC GCATATCGTA TTTATGATGA CTTTTCCAGA AGATATTAAG 780

CCTATCATCA AAGCATTAAA TAATTCGTCG TATCAGAAAA CTGTTTTAAC AACACCAAAA 840

CAAC2CGCCTT ATTTATCTGA ACTTAGCGAC GATGTTGATG TGATAGAAAT GACTAATCGA 900

ACATTGGTAA AACAAATTAA GGCTTTGAAA AGCGCGCAGA TGATTATTAT CGATAATTAT 960

TACCTATTGC TAGGTGGATA TAATAAGACT TCTAATCAAC ACATTGTTCA AACGTGGCAT 1020

GCAAGTGGTG CATTAAAAAA CTTTGGCTTA ACAGATCATC AAGTCGATGT GTCTGACAAG 1080

GCAATGGTTC AGCAGTACCG TAAAGTTTAT CAAGCGACGG ATTTTTACTT AGTGGGTTGT 1140

GAACAAATGT CACAATGTTT TAAACAGTCT TTAGGTGCAA CAGAAGAGCA AATGCTGTAT 1200

TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 1260

TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 1320

GATAAAGCAG ATAATAGGGC TATTGATAAA GCTTATTTTG AAAAATGTTT ACCAGGATAT 1380
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ATCGACACGT CTACATTAAT GCTAATGTCA GATATAATTA TTAGCGACTA TAGTTCGCTG 1500 

CCAATAGAAG CTAGCTTGTT AGATATTCGA ACTATATTTT ATGTGTATGA TGAAGGAACA 1560 

TATGATCAGG TGAGAGGCCT GAATCAATTT TACAAAGCAA TACCGGATAG CTACAAAGTG 1620 

TATACTGAAG AAGATTTAAT AATGACGATA CAAGAAAAAG AACATCTATT AAGTCCGTTA 1680

TTTAAAGATT GGCATAAGTA TAATAGTGAT AAAAGTTTAC ATCAGCTCAC AGAATATATA 1740

GATAAGATGG TGACAAAATG AGGTTTACGA TAATCATACC TACATGTAAT AATGAGGCAA 1800 

CAATTCGACA ATTGTTAATA TCTATTGAGA GTAAAGAACA CTATAGAATC CTTTGTATTG 1860 

ATGGTGGTTC TACTGATCAA ACAATTCCTA TGATTGAACG GTTACAAAGA GAACTCAAGC 1920 

ATATTTCATT AATACAATTA CAAAATGCTT CGATAGCTAC GTGTATTAAT AAAGGTTTGA 1980 

TGGATATCAA AATGACAGAT CCACATGATA GTGACGCATT TATGGTCATA AAACCAACAT 2040 

CAATCGTATT GCCAGGTAAA TTAGATAGGT TAACTGCTGC TTTCAAAAAT AATGATAATA 2100 

TTGATATGGT AATAGGGCAG CGAGCTTACA ATTACCATGG TGAATGGAAA TTGAAAAGTG 2160 

CTGATGAGTT TATTAAAGAC AATCGAATCG TTACATTAAC GGAACAACCA GATTTGTTAT 2220

CAATGATGTC TTTTGACGGA AAGTTATTCA GTGCTAAATT TGCTGAATTA CAGTGTGaCG 2280

AAACTTTAGC TAACaCATAC AATCACGCAA TACTTGTCAA GGCGATGCAA AAAGCTAGGG 2340

ATATACATTT AGTTTCACAG ATGATTGTCG GAGATAACGA TATAGATACA CATGCTACAA 2400

GTAACGATGA AGATTTTAAT AGATATATCA CAGAAATTAT GAAAATAAGA CAACGAGTCA 2460

TGGAAATGTT ACTATTACCT GAACAAAGGC TATTATATAG TGATATGGTT GATCGTATTT 2520

TATTCAATAA TTCATTAAAA TATTATATGA ACGAACACCC AGCAGTAACG CACACGACAA 2580

TTCAACTCGT AAAAGACTAT ATTATGTCTA TGCAGCATTC TGATTATGTA TCGCAAAACA 2640

TGTTJGACAT TATAAATACA GTTGAATTTA TTGGTGAGAA TTGGGATAGA GAAATATACG 2700

AATTGTGGCG ACAAACATTA ATTCAAGTGG GCATTAATAG GCCGACTTAT AAAAAATTCT 2760

TGATACAACT TAAAGGGAGA AAGTTTGCAC ATCGAACAAA ATCAATGTTA AAACGATAAC 2820

GTGTACATTG ATGACCATAA ACTGCAATCC TATGATGTGA CAATATGAGG AGGATAACTT 2880

AATGAAACGT GTAATAACAT ATGGCACATA TGAGTTACTT CACTATGGTC ATATCGAATT 2940

GCTTCGTCGT GCAAGAGAGA TGGGCGATTA TTTAATAGTA GCATTATCAA CAGATGAATT 3000

TAATCAAATT AAACATAAAA AATCTTATTA TGATTATGAA CAACGAAAAA TGATGCTTGa 3060

ATCAATACGC TATGTCGATT TAGTCATTCC AGAAAAGGGC TGGGGACAAA AAGAAGACGA 3120

TGTCGAAAAA TTTGATGTAG ATGTTTTTGT TATGGGACAT GACTGGGAAG GTGAATTCGA 3180 
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TAAAATCAAA 


■ CAAGAATTAT 


' ATGGTAAAGA 


TGCTAAATAA 


ATTATATAGA 


. ACTATCGATA 


330.0 




CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 3360

CTTTATAATG TGCAACTTGT CCGTTTTTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 3420

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 3480

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3540

TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3600

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 3660

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3720

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3780

CCCCATCATA TTAOGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTG CACCCATGAT 3840

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC GTATGGTTGT AATTTGCTGT 3900

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960

TGAAAAGTTG AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4020

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 4140

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4200

GTTATTCATT AAATCAACGA AATCGCTGGT GTTTTTTGAA ACCTTCTTAG CTAAAATTAA 4260

TGCCGCGGCA TTACTAGAAT TAGATAGTGT AATTTGTAAT AGGTCTGCGA TTGTCCATAC 4320

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4380

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 4440

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4560

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4620

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4680


45 


TAAATTTTTC 


ATAAAGCGTT 


AATCTTGCCT 


TTTCCAATTC 


TTAAATATTC 


C CT A A A A fir* A 






ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4800

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4860

AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4920

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980
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AAGTATATGA 


TAGAAATGCA 


TGTATCTATC 


TAAATGAATT 


AACTATAAAT 


TTCAAACAGA 


5100 




AGAGGTAAAA CTATGAAACG AGAAAATCCA TTGTTTTTCT TATTTAAAAA ACTATCATGG 5160

CCAGTGGGTC TTATCGTTGC AGCTATGACT ATTTCATCAC TAGGGAGCTT AAGTGGACTA 5220

TTAGTGCCAC TGTTTACTGG ACGAATTGTA GATAAATTTT CCgTGAGCCA TATCAATTGG 5280

AATCtAATCG CATTATTTGG TGGTATCTTT GTCATCAATG CTTTATTAAG CGGATTAGGT 5340

TTATATTTAT TAAGTAAAAT TGGTGAAAAG ATTATTTATG CGATACGCTC AGTTTTATGG 5400

GAGCATATCA TACAATTAAA AATGCCATTC TTTGACAAAA ATGAAAGTGG TCAATTAATG 5460

AGTCGATTAA CTGACGATAC GAAAGTGATA AATGAATTTA TTTCACAAAA GCTACCTmAC 5520

TTATTACCAT CAATCGTTAC ATtAGTTGGG TCACTAATCA TGTTATTTAT TTTAGATTGG 5580




AAAATGACAT 


TATTAACATT 


TATAACGATA 


CCGATATTCG 


TTTTaATTAT 


GATTCCTCTA 




20 


GGTCGTATTA 


TGCAAAAGAT 


ATCGACAAGT 


ACACAATCTG 


AAATTGCAAA 


CTTCAGTGGT 






TTWTT Afl^TSC " 
x x vj x x n\j\3\jv< 


GTGTCCTAAC 


TGAAATGCGT 


CTTGTTAAAA 


TATCAAATAC 


AGAGCGTCTT 






GAATTAGATA 


ATGCACATAA 


AAATTTGAAT 


GAAATATATA 


AATTAGGTTT 


AAAACAGGCT 




25 


AAAATTGCGG 


CAGTTGTACA 


ACCAATTTCA 


GGTATAGTTA 


TGTTGCTAAC 


AATTGCAATT 


coon 




^» A A A A /IsTVJ X X 


TTGGTGCATT 


AGAAATTGCG 


ACTGGTGCAA 


TCACTGCAGG 


TACATTAATT 






GCAATG A T AT 


TTTATGTTAT 


TCAGTTATCT 


ATGCCTTTAA 


TCAATCTTTC 


CACGTTAGTT 


ouuu 


30 


ACAGATTATA 


AAAAGGCAGT 


CGGTGCAAGT 


AGTAGAATAT 


ACGAAATCAT 


GCAAGAACCT 


DvOU 




ATTGAACCGA CAGAAGCTCT TGAAGATTCT GAAAATGTAT TAATTGATGA CGGTGTATTG 6120

TCATTTGAAC ATGTAGACTT TAAATATGAT GTGAAGAAAA TATTAGATGA TGTGTCGTTC 6180

CAAATCCCAC AAGGTCAAGT GAGTGCTTTT GTAGGCCCTT CTGGGTCTGG TAAAAGTACG 6240




ATATTTAATC 


TGATAGAACG 


TATGTATGAA 


ATTGAGT CAG 


GTGATATTAA 
w x vj*» x x x nn 


± J± X VJVJ^w X X 


6300 


40 


GAAAGTGTCT ATGATATCCC GTTATCTAAG TGGCGACGCA AAATTGGATA TGTTATGCAA 6360

TCAAATTCGA TGATGAGTGG TACAATTAGA GACAATATTT TATACGGAAT TAATCGTCAT 6420

GTTTCAGATG AAGAACTTAT TAATTATGCT AAATTAGCGA ACTGTCATGA TTTTATCATG 6480

CAATTTGATG AAGGATATGA CACGCTTGTA GGTGAACGAG GATTGAAACT GTCTGGCGGA 6540

CAACGTCAAC GTATTGATAT TGCTAGAAGT TTTGTTAAAA ATCCTGATAT TTTGTTACTT 6600

GATGAAGCAA CAGCTAATCT CGATAGTGAA AGTGAATTGA AAATTCAAGA AGCTTTAGAA 6660

ACATTGATGG AAGGTAGAAC AACGATTGTC ATTGCGCATC GTTTGTCTAC AATTAAAAAA 6720

GCCGGTCAAA TTATATTCTT AGACAAAGGA CAGGTAACAG GTAAAGGTAC GCATTCAGAA 6780
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TTTTATATAT ATAAGTAAGC TTGGAGCAAA TACACATATA CCATCGAGGA AATTAAAGTG 6 900 

TGGCACATTG ATGGATATAG ATGTTAATAA ATTGCTTCAA GCTTTTGTCT ATTTTAAATC 6960

ATTTGAGAAG TTACGACATA ATAATTCTTA AATTAATGAA ATCGATATTT TAAGAAAAAA 7020

ATGCTCATGG TATAATACAA GTTATAAGCA AACATACATA TATTAAATAC TGTAGCCACG 7080

AGTCATAATT CTTCATATTT TACATAGCAA TTTAACTGAT TTTAGAGTCC ACGGTACAGA 7140

AGTTTGATAT TTCAATGTTT CTAAATTTTT AAAAAATTAA ATCATAGGTG GGTGCCAAAT 7200

GTTTTTATTA ATCAACATTA TTGGTCTAAT TGTATTTCTT GGTATTGCGG TATTATTTTC 7260

AAGAGATCGC AAAAATATCC AATGGCAATC AATTGGGATC TTAGTTGTTT TAAACCTGTT 7320

TTTAGCATGG TTCTTTATTT ATTTTGATTG GGGTCAAAAA GCAGTAAGAG GAGCAGCCAA 7380

TGGTATCGCT TGGGTAGTTC AGTCAGCGCA TGCTGGTACA GGTTTTGCAT TTGCAAGTTT 7440

GACAAATGTT AAAATGATGG ATATGGCTGT TGCAGCCTTA TTCCCAATAT TATTAATAGT 7500

GCCATTATTT GATATCTTAA TGTACTTTAA TATTTTACCG AAAATTATTG GAGGTATTGG 7560

TTGGTTACTA GCTAAAGTAA CAAGACAACC TAAATTCGAG TCATTCTTTG GGATAGAAAT 7620

GATGTTCTTA GGAAATACTG AAGCATTAGC CGTATCAAGT GAGCAACTAA AACGTATGAA 7680

TGAAATGCGT GTATTAACAA TCGCAATGAT GTCAATGAGC TCTGTATCGG GAGCTATTGT 7740

AGGTGCGTAT GTACAAATGG TACCAGGAGA ACTGGTACTA ACGGCAATTC CACTAAATAT 7800

CGTTAACGCG ATTATTGTGT CATGCTTGTT GAATCCAGTA AGTGTTGAAG AGAAAGAAGA 7860

TATTATTTAC AGTCTTAAAA ACAATGAAGT TGAACGTCAA CCATTCTTCT CATTCCTTGG 7920

AGATTCTGTA TTAGCAGCAG GTAAATTAGT ATTAATCATC ATCGCATTTG TTATTAGTTT 7980

TGTAGCGTTA GCTGATCTAT TTGATCGTTT TATCAATTTG ATTACAGGAT TGATAGCAGG 8040

ATGGATAGGC ATAAAAGGTA GTTTCGGTTT AAACCAAATT TTAGGTGTGT TTATGTATCC 8100

ATTTGCGCTA TTACTCGGTT TACCTTATGA TGAAGCGTGG TTGGTAGCAC AACAAATGGC 8160

TAAGAAAATT GTTACAAATG AATTTGTTGT TATGGGTGAA ATTTCTAAAG ATATTGCATC 8220

TTATACACCA CACCATCGTG CGGTTATTAC AACATTCTTA ATTTCATTTG CAAACTTCTC 8280

AACGATTGGT ATGATTATCG GTACATTGAA AGGCATTGTT GATAAAAAGA CATCAGACTT 8340

TGTATCTAAA TATGTACCTA TGATGCTATT ATCAGGTATC CTAGTTTCAT TATTAACAGC 8400

AGCTTTCGTT GGTTTATTTG CATGGTAATA TGTCGAAGAG TGACTATGAT AATACATTTT 8460

AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 8520

TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 8580 
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GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 8700 

CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA ACATTATAGC AGACAATGAA 8760 

ATGAAAGTAA ATTAAAAAT 8779

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31096 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA GCTTGCACAC CCGAAAATGT 60 

GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC 120

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG ATCGTAACTC ACGAACAAGC 180

ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT GAAAGCGAAT ATCAATATTT 240

CAAAAAGAAT CAAATTATCT GGGGATTTTT ACATCTAGCA TCTTCAAAAG AAATAGTAGA 300

AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA ACCATTATAA AAAATGGAAA 360

AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA CGCTCAGCAA TTATGGGAGC 420

TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA GTGACTGGTG TACATGAAAA 480

TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GGAGTAGCAG CAACAAATGC 540

AGCAAATGTT GCCTTGGGAC TAAATGCTAA AGTAATCATT ATCGAGTTAA ACGATGACCG 600

CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC ACAGTAGTCA AATCAACACC 660

AGA^AATTTA GCAGAACAAA TTAAGAAAGC AGATGTATTT ATTTCTACAA TTTTAATTTC 720

AGGTGCGAAA CCGCCAAAAT TGGTTACTCG TGAGATGGTT AAATCAATGA AAAAAGGTTC 780

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT GAAACAATTA GACCAACTAC 840

AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT TATGGTGTAC CAAATCAACC 900

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATTG ATTATATATT 960

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT AATGAAGCCT TAAGTACTGG 1020

TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA GCTTCATCAC ATGACCTAGA 1080

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200
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TCGAAGAAGC TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA ATTAAATCAA 1320 

TGTATTTAAG CCAAAGTATA ACTAAAGGGA ATGTATTTCT AAAATTAGAA AATATGCAAT 1380

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATllAAAA TTAATCACTT AACAGATGAA 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG GTGTTGCTTT 1500

AACAGCTAAA TTATTAGGCA TTGATGCAAC GATTGTAATG CCTGAAACAG CACCACAAGC 1560

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA AAAACTTTAA 1620

CGAAACTAGA CTTTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TCGTTCATCC 1680

ATATGACGAT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA TTTTAGATGA 1740

TATTTGGAAT GTGAATACAG TCATCGTACC AGTTGGCGGT GGAGGATTAA TTGCAGGTAT 1800

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AATCTGAGAA 1860

TGTTCATGGT ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC GAGTGGATAG 1920

CACAATAGCA GATGGTTGTG ATGTAAAAGT TCCTGGTGAA CAAAGATATG AAGTAGTTAA 1980

ACATTTAGTA GATGAATTTA TTCTTGTTAC TGAAGAAGAA ATTGAACATG CTATGAAAGA 2040

TTTAATGCAG CGTGCCAAAA TTATTACTGA AGGTGCAGGC GCATTACCAA CAGCTGCAAT 2100

TTTAAGTGGA AAAATAAACA ATAAATGGCT TGAAGATAAA AATGTTGTTG CATTAGTTTC 2160

AGGCGGGAAT GTTGACTTAA CTAGAGTTTC AGGTGTCATT GAACATGGAC TGAATATTGC 2220

AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG GTGTAATTAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG CTATTGTTAT 2340

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAGTAACAGG 2400

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA TTTGTGCGGG 2460

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA CGAAGTATAT 2520

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATGAGGT TGGGCGCAAT CATTTATTTA 2580

TTTTCCAGCT AACGTAGCAG CATTGTCTAT CGTATTTGCG ACACAGCTAA TTAATTTATT 2640

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT CTATTGTGTT 2700

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT TAGTAATTAA 2760

ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820

TTCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880

TTTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940

ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGATTTCA GTTGGTATCG GTTGTATTAT 3000
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TGGTAATTTA 


AATGCAGCTT 


CAGATACATC 


AAAAATATTA 


TTTGGTGAAA 


ATGGCGGTAA 


3120 




GATTATTACA ATCGGTATAT TAATTTCTGT TTATGGTACG ATCAATGGCT ATAGTATGAC 3180

TGGTATGCGC GTACCATATG CAATGGCTGA AAGAAAATTA TTGCCATTTA GCCATTTATT 3240

CGCAAAATTA ACAAAATCTG GCGCACCATG GTTTGGCGCA ATTATACAAC TTATAATCGC 3300

TATCATCATG ATGTCAATGG GAGCATTTGA TACAATTACA AATATGTTAA TCTTTGTTAT 3360

TTGGTTGTTC TATTGTATGT CATTTGTTGC GGTAATAATT TTAAGAAAAC GTGAACCAAA 3420

TATGGAACGA CCATATAAAG TACCGTTATA TCCGATCATA CCTTTAATTG CTATTTTGGC 3480

AGGATCATTT GTATTAATTA ATACACTGTT TACACAATTT ATATTAGCAA TCATTGGAAT 3540

TCTAATAACA GCACTTGGTA TACCAGTTTA TTACTATAAA AAGAAACAAA AAGCAGCATA 3600

AGGTAAGATA ACTAGCATTG AGAATAAATG GATGGACTAC TAATAAATTT AAAGTTTTAC 3660

ACATTAAAAT CAAAAACCAT TCAATTATTC TATGGAACAG ACAAATTTCT GTTATGGAAT 3720

TTGTCTGTTT TTCAAAAGTA TAGGGAGGCA AATAGAGATG GAAAAGCCGT CAAGAGAGGC 3780

ATTTGAAGGC AATAATAAGT TGTTAATAGG AATTGTTCTA AGTGTAATAA CGTTTTGGCT 3840

ATTTGCACAA TCATTGGTTA ATGTTGTACC AATACTTGAA GATAGTTTCA ATACAGATAT 3900

TGGAACGGTT AATATCGCCG TTAGTATAAC TGCTTTATTT TCAGGAATGT TTGTAGTAGG 3960

AGCAGGTGGT CTTGCTGATA AATATGGCAG AATTAAACTC ACGAACATTG GTATTATCTT 4020

AAATATATTA GGTTCATTAT TAATCATTAT TTCAAATATT CCTTTATTAC TTATTATAGG 4080

AAGATTAATT CAAGGACTTT CAGCAGCATG TATTATGCCT GCAACTTTGT CTATTATTAA 4140

GTCATATTAC ATTGGGAAAG ATAGACAACG CGCTTTAAGT TATTGGTCAA TTGGCTCATG 4200

GGGCGGCTCT GGTGTTTGTT CATTTTTTGG AGGTGCAGTT GCAACGCTTT TAGGTTGGCG 4260

TTGGATTTTC ATCCTATCAA TTATAATTTC ATTAATTGCA CTGTTTCTTA TTAAAGGCAC 4320

ACCTGAAACT AAATCTAAAT CGATTTCTCT AAATAAATTT GACATTAAAG GTCTGGTTCT 4380

TTTAGTCATT ATGCTCCTCA GTTTAAATAT TTTAATTACT AAAGGATCAG AATTAGGTGT 4440

AACCTCACTT CTTTTTATTA CTTTATTAGC TATTGCAATT GGATCTTTTA GTTTATTTAT 4500


45 






CAAATCcl~i r r 


AATCGATTTT 


AAATTATTTA 


AAAATAAAGC 


4560 




TTACACAGGT GCAACAGCTT CAAACTTTTT GTTAAATGGT GTTGCAGGAA CATTAATAGT 4620

AGCCAACACA TTTGTTCAAA GAGGTTTAGG ATATTCTTCA TTGCAAGCAG GAAGTTTATC 4680

AATCACTTAT TTAGTAATGG TACTAATTAT GATTCGTGTT GGTGAAAAGT TACTTCAAAC 4740

ACTCGGATGC AAGAAACCAA TGTTAATTGG AACAGGAGTT CTTATTGTCG GAGAATGTCT 4800
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ATTCTTTGGT 


TTAGGACTAG 


GGATATATGC 


TACACCATCA 


ACAGATACAG 


CAATTGCAAA 


4920 




TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG 4980

TGGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA 5040

CATTTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT 5100

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAAGGACAC TCAATTATGA TAATTGAGAA 5160

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG 5220

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT 5280

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA 5340




TATTCTGGAG 


CATAAATAAA 


TTGTTCAACA 


CATAGTTGTA ATGTGTTTCA ATACITITO 






GATAGATTGC 


GAAATTGTAT 


TGAATCGTCA 


TCGTTTTAAA 


TTTTTAAATG 


AGAATGGAAT 


CA C A 


20 


GAGCATTACA 


ATACACAAGC 


AATCAAAAGT 


AAATACATTC 


ACAACACAAC 


AGAGACATAA 






CAACAAGATA 


AGGAGTGAAC 


AATAGCTGTG 


AATTATCGTG 


ATAAAATTCA 


AAAGTTTAGT 






ATTCGTAAAT 


ATACAGTTGG 


TACATTTTCA 


ACTGTCATTG 


CGACATTGGT 


ATTTTTAGGA 


3040 


25 


TTCAATACAT 


CACAAGCACA 


TGCTGCTGAA 


ACAAATCAAC 


CAGCAAGCGT 


GGTTAAACAG 






AAACAACAAA 


GTAATAATGA 


ACAGACTGAG 


AATCGAGAAT 


CTCAAGTACA 


AAATTCTCAA 


5760 




AATTCACAAA ATGGTCAATC 


ATTATGTGCT 


ACTCATGAAA 


ATGAG CAACC 


AAATATTAGT 


coin 


30 


CAAGCTAATT 


TAGTAGATCA 


AAAAGTAGCG 


CAATCATCTA 


CTACTAATGA 


TGAACAACCA 


COQA 

DboU 




GCATCTCAAA 


ATGTAAATAC 


AAAGAAAGAT 


TCGGCAACGG 


CTGCGACAAC 


ACAACCAGAT 




35 


AAAGAACAAA GTAAGGATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC 6000

AATAGAGCGG CTCATGTAGA AAATCATGAA GCAAATGTAG TAACAGCTTC AGATTCATCT 6060

GATAATGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT 6120

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA 6180

GGCATTTTTG ATAAGATTAA TACGTTATTA GGCAGTAATG ATCCAATAAA CAATAAAGAC 6240

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA 6300

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACGC GTTCGGTTGA GTCAAGAGCT 6360

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA TGTTGAAAAT 6420

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG 6480

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT 6540

GCTCTTATGA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA 6600
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GTAGGAAGAA 


CTGACTTTGT 


AACAGTTAAT 


TCAGATGGAA 


CAAATGTACA 


ATGGAGTCAT 


6720 




GGAGCAGGAG CAGGTGCAAA TAAACCACTT CAACAAATGT GGGAATATGG AGTAAATGAT 6780

CCTCATCGTT CACATGACTT TAAAATAAGA AATAGAAGTG GCCAAGTAAT ATATGACTGG 6840

CCAACTGTCC ATATTTATTC TTTAGAAGAT TTATCTAGAG CGAGTGATTA TTTTAGTGAA 6900

GCTGGAGCGA CACCTGCTAC TAAAGCTTTT GGTAGACAAA ATTTTGAATA TATTAATGGT 6960

CAAAAACCTG CTGAATCACC GGGTGTTCCT AAAGTTTATA CTTTCATCGG TCAAGGTGAT 7020

GCAAGTTATA CAATTTCATT TAAAACACAA GGTCCAACTG TTAATAAATT GTACTATGCA 7080

GCAGGTGGGC GTGCTTTAGA GTACAATCAA TTATTTATGT ACAGTCAACT ATACGTCGAA 7140

TCAACGCAAG ACCATCAACA ACGTCTTAAT GGTTTAAGAC AAGTGGTTAA TCGTACATAT 7200

CGCATAGGTA CAACTAAACG TGTAGAAGTG AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 7260

TTAGAAAGTA CAAACCTAAA TATAGATGAT TTTGTTGATG ATCCTTTAAG TTATGTTAAG 7320

ACGCCGAGTA ATAAAGTGTT AGGATTTTAT TCGAATAATG CAAATACTAA TGCTTTTAGA 7380

CCGGGTGGAG CCCAACAATT AAATGAATAT CAATTAAGTC AATTATTTAC TGATCAAAAA 7440

TTACAAGAAG CAGCAAGAAC TAGAAACCCA ATAAGATTAA TGATTGGTTT CGACTATCCT 7500

GATGCTTATG GTAATAGTGA AcTTTAGTTC CTGTTAACTT AACGGTATTA CCTGAAATCC 7560

AACATAATAt TaAATTCTTT AAAAATGACG ATACTCAAAA TATTGGTGAA AAACCATTTT 7620

CAAAACAAGC TGGGCATCCA GTTTTCTATG TATATGCAGG TAACCAAGGG AATGCTTCCG 7680

TGAATTTAGG TGGTAGCGTA ACATCTATTC AACCATTACG TATTAATTTA ACAAGTAATG 7740

AGAATTTTAC AGATAAAGAT TGGCAAATTA CAGGTATTCC GCGTACATTA CACATTGAAA 7800

ACTCGACAAA TAGACCTAAT AATGCCAGAG AACGCAATAT TGAACTTGTT GGTAACTTAT 7860

TACC&GGGGA TTACTTTGGA ACGATACGTT TTGGACGTAA AGAACAATTA TTCGAAATTC 7920

GTGTTAAACC ACATACACCA ACAATTACAA CGACAGCTGA GCAATTAAGA GGTACAGCAT 7980

TACAAAAAGT GCCTGTTAAT ATTTCGGGAA TACCGTTGGA TCCATCGGCA TTGGTTTATT 8040

TAGTTGCACC AACAAATCAA ACTACGAATG GTGGTAGTGA GGCAGATCAA ATACCATCTG 8100

GTTATACGAT ACTTGCGACT GGTACACCTG ATGGGGTGCA TAATACAATT ACTATACGAC 8160

CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAkCAA 8340

CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400
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TACAGGTAGA GTGAGTATGA ATCAGGCATT TAACAGTGAT ATTACATTTA AAGTGTCAGC 8520 

GACAGaCAAT GTCAATAATA CGACAAATGA TAGTCAATCT AAACATGTTT CAATTCATGT 8580 

AGGTAAAATT AGTGAAGATG CTCATCCGAT TGTATTAGGA AATACTGAGA AAGTTGTAGT 8640

AGTCAATCCG ACTGCTGTAT CTAATGATGA AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700

TAAAAACCAA AATATAAGAG GATATTTAGC ATCAACTGAT CCAGTAACTG TCGATAATAA 8760

TGGTAATGTC ACATTACATT ACCGTGATGG CTCATCGACA ACGCTTGATG CTACAAATGT 8820

GATGACATAC GAACCAGTTG GTGAAACCTGA 1 ATACCAAACT GTCAATGCTG CTAAAACAGC 8880

AACGGTAACG ATTGCTAAAG GACAATCATT TAGTATTGGT GATATTAAAC AATATTTTAC 8940

TTTAAGTAAT GGACAACCTA TTCCAAGTGG CACATTTACA AATATTACAT CTGATAGAAC 9000

TATTCCAACT GCACAAGAAG TTAGTCAAAT GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060

TGCTACAAAT GCGTATCATA AAGATAGTGA AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120

TGTGAAACAA CCAGAAGGCG ATCAACGTGT ATATCGTACA TCAACATATG ATTTAACTAC 9180

TGATGAAATC TCAAAAGTAA AACAAGCATT TATTAATGCA AATAGAGATG TAATTACGCT 9240

TGCCGAAGGT GATATTTCAG TTACAAATAC ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

AGTAAATATT AATAAAGGTC GATTAAOGAA ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTTCTTGCGT TGGGTTAATT TCCCACAAGA TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

TGCAAACAGA CCAACAGATG GTGGTTTATC ATGGTCTGAT GACCATAAAT CTTTAATTTA 9480

TCGTTATGAT GCTACATTAG GTACTCAAAT TACGACGAAT GATATTTTAA CAATGTTAAA 9540

AGCAACAACT ACAGTGCCTG GATTGCGAAA TAACATTACT GGTAATGAAA AATCACAAGC 9600

AGAAGCTGGC GGAAGACCTA ACTTTAGAAC GACTGGTTAT TCACAATCAA ATGCGACAAC 9660

TGATGGTCAA CGTCAATTTA CGTTGAATGG TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720

CCCTTCAAAC GGTTATGGTG GGCAACCTGT TACAAATTCA AATACTCGTG CAAACCATAG 9780

TAACTCAACT GTTGTTAAGG TAAACGAACC GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840

TGACCACGTT GTAAAAAGTA ATTCTACACA TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900

GTTATACTTA ACGCCATATG GTCCAAAACA ATATGTTGAA CATTTAAATC AAAATACAGG 9960

AAATACTACT GACGCTATTA ACATTTATTT TGTACCAAGT GACTTAGTGA ATGCAACAAT 10020

TTCAGTAGGT AATTACACTA ATCATCAAGT GTTCTCAGGT GAAACATTTA CAAATACTAT 10080

TACAGCGAAT GATAACTTTG GTGTGCAATC TGTAACTGTA CCAAATACAT CACAAATTAC 10140

AGGTACTGTT GATAATAACC ATCAACATGT TTCTGCAACG GCACCAAATG TGACATCAGC 10200
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10 






GTTCAATGTA ACAGTGAAAC CTTTGCGTGA TAAATATCGA GTTGGTACTT CATCAACGGC 10320

TGCTAATCCT GTGAGAATTG CCAATATTTC GAATAATGCG ACAGTATCAC AAGCTGATCA 10380 

AACGACAATT ATTAATTCGT TAACGTTTAC TGAAACAGTA CCAAATAGAA GTTATGCAAG 10440 

AGCAAGTGCG AATGAAATCA CTAGTAAAAC AGTTAGTAAT GTCAGTCGTA CTGGAAATAA 10500 

TGCCAATGTg CACAGTAACT GTTACTTATC AAGATGGAAC AACATCAACA GTGACTGTAC 10560 

CTGTAAAGCA TGTCATTCCA GAAATCGTTG CACATTCGCA TTACACTGTA CAAGGCCAAG 10620 

ACTTCCCAGC AGGTAATGGT TCTAGTGCAT CAGATTACTT TAAGTTATCT AATGGTAGTG 10680 

ACATTGCAGA TGCAACTATT ACATGGGTAA GTGGACAAGC GCCAAATAAA GATAATACAC 10740 

GTATTGGTGA AGATATAACT GTAACTGCAC ATATCTTAAT TGATGGCGAA ACAACGCCGA 10800 

TTACGAAAAC AGCAACATAT AAAGTAGTAA GAACTGTACC GAAACATGTC TTTGAAACAG 10860 

CCAGAGGTGT TTTATACCCA GGTGTTTCAG ATATGTATGA TGCGAAACAA TATGTTAAGC 10920 

CAGTAAATAA TTCTTGGTCG ACAAATGCGC AACATATGAA TTTCCAATTT GTTGGAACAT 10980 

ATGGTCCTAA CAAAGATGTT GTAGGCATAT GTACTCGTCT TATTAGAGTG ACATATGATA 11040 

ATAGACAAAC AGAAGATTTA ACTATTTTAT CTAAAGTTAA ACCTGACCCA CCTAGAATTG 11100 

ACGCAAACTC TGTGACATAT AAAGCAGGTC TTACAAACCA AGAAATTAAA GTTAATAACG 11160

TATTAAATAA CTCGTCAGTA AAATTATTTA AAGCAGATAA TACACCATTA AATGTCACAA 11220

ATATTACTCA TGGTAGCGGT TTTAGTTCGG TTGTGACAGT AAGTGACGCG TTACCAAATG 11280

GCGGAATTAA AGCAAAATCT TCAATTTCAA TGAACAATGT GACGTATACG ACGCAAGACG 11340

AACATGGTCA AGTTGTTACA GTAACAAGAA ATGAATCTGT TGATTCAAAT GACAGTGCAa 11400 

CAGTAACAGT GACACCACAA TTACAAGCAA CTACTGAAGG CGCTGTATTT ATTAAAGGTG 11460 

GCGA&GTTT TGATTTCGGA CACGTAGAAA GATTTATTCA AAACCCGCCA CATGGGGCAA 11520 

CGGTTGCATG GCATGATAGT CCAGATACAT GGAAGAATAC AGTCGGTAAC ACTCATAAAA 11580 

CTGCGGTTGT AACATTACCT AATGGTCAAG GTACX3CGTAA TGTTGAAGTT CCAGTGAAAG 11640 

TTTATCCAGT TGCTAATGCA AAGGCGCCAT CACGTGATGT GAAAGGTCAA AATTTGACTA 11700 

ATGGAACGGA TGCGATGAAC TACATTACAT TTGATCCAAA TACAAACACA AATGGTATCA 11760 

CTGCAGCATG GGCAAATAGA CAACAACCAA ATAACCAACA AGCAGGCGTG CAACATTTAA 11820 

ATGTCGATGT CACATATCCA GGTATTTCAG CTGCTAAACG AGTTCCTGTT ACTGTTAATG 11880

TATATCAATT TGAATTCCCT CAAACTACTT ATACGACAAC GGTTGGAGGC ACTTTAGCAA 11940

GTGGTACGCA AGCATCAGGA TATGCACATA TGCAAAATGC TACTGGTTTA CCAACAGATG 12000
TGAATAAACC GAATGTGGCT AAAGTCGTTA ACGCAAAATA TGACGTCATC TATAACGGAC 12120

ATACTTTTGC AACATCTTTA CCAGCGAAAT TTGTAGTAAA AGATGTGCAA CCAGCGAAAC 12180

CAACTGTGAC TGAAACAGCG GCAGGAGCGA TTACAATTGC ACCTGGAGCA AACCAAACAG 12240

TGAATACACA TGCCGGTAAC GTAACGACAT ACGCTGATAA ATTAGTTATT AAACGTAATG 12300

GTAACGTTGT GACGACATTT ACACGTCGCA ATAATACGAG TCCATGGGTG AAAGAAGCAT 12360

CTGCAGCAAC TGTAGCAGGT ATTGCTGGAA CTAATAATGG TATTACTGTT GCAGCAGGTA 12420

CTTTCAACCC TGCTGATACA ATTCAAGTTG TTGCAACGCA AGGAAGCGGA GAGACAGTGA 12480

GTGATGAGCA ACGTAGTGAT GATTTCACAG TTGTCGCACC ACAACCGAAC CAAGCGACTA 12540

CTAAGATTTG GCAAAATGGT CATATTGATA TCACGCCTAA TAATCCATCA GGACATTTAA 12600

TTAATCCAAC TCAAGCAATG GATATTGCTT ACACTGAAAA AGTGGGTAAT GGTGCAGAAC 12660

ATAGTAAGAC AATTAATGTT GTTCGTGGTC AAAATAATCA ATGGACAATT GCGAATAAGC 12720

CTGACTATGT AACGTTAGAT GCACAAACTG GTAAAGTGAC GTTCAATGCC AATACTATAA 12780

AACCAAATTC ATCAATCACA ATTACTCCGA AAGCAGGTAC AGGTCACTCA GTAAGTAGTA 12840

ATCCAAGTAC ATTAACTGCA CCGGCAGCTC ATACTGTCAA CACAACTGAA ATTGTGAAAG 12900

ATTATGGTTC AAATGTAACA GCAGCTGAAA TTAACAATGC AGTTCaAGTT GCTAATAAAC 12960

GTACTGCAAC GATTAAAAAT GGCACAGCAA TGCCTACTAA TTTAGCTGGT GGTAGCACAA 13020

CGACGATTCC TGTGACAGTA ACTTACAATG ATGGTAGTAC TGAAGAAGTA CAAGAGTCCA 13080

TTTTCACAAA AGCGGATAAA CGTGAGTTAA TCACAGCTAA AAATCATTTA GATGATCCAG 13140

TAAGCACTGA AGGTAAAAAG CCAGGTACAA TTACGCAGTA CAATAATGCA ATGCATAATG 13200

CGCAACAACA AATCAATACT GCGAAAACAG AAGCACAACA AGTGATTAAT AATGAGCGTG 13260

CAACACCACA ACAAGTTTCT GACGCACTAA CTAAAGTTCG TGCAGCACAA ACTAAGATTG 13320

ATCAAGCTAA AGCATTACTT CAAAATAAAG AAGATAATAG CCAATTAGTA ACGTCTAAAA 13380

ATAACTTACA AAGTTCTGTG AACCAAGTAC CATCAACTGC TGGTATGACG CAACAAAGTA 13440

TTGATAACTA TAATGCGAAG AAGCGTGAAG CAGAAACTGA AATAACTGCA GCTCAAGGTG 13500

[illegible] TGGCGATGCA ACTGCACAAC AAATTTCAGA TGAAAAACAT CGTGTCGATA 13560

ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800
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CTGATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 13920 

TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCGG 13980 
GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 14040

AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 14100 

CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 14160 

TTGATCAGCC AAGGAGTACG ACTGGTATGA CAAGCGCATC TATTGCAGCA TTTAATGAAA 14220 

AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 14280 

ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGG CGCTAAATCA GCACTTGATC 14340 

AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGGGAAA AATCAACTAC 14400

AACATAGTAT TGACAGGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 14460

ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 14520

GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 14580

ATTTAGATCA TGGACGTCAA GCTTTAAGAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA 14640

CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 14700 

TAAATGCGTA CAACCAAAAA TTACAAGCAG CGCGTCAAAA GTTAACTGAA ATTAATCAAG 14760 

TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 14820 

CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGACAG CCAGCGTTAA 14880

CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA 14940

TTAATGCTGC TCAAAATcAT GctGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 15000

ATACTGCGAT GACGAAATTA AAAGAGAGTG TTGCGGATAA TAATACAATT AAATGAGATC 15060 

AAAATTACAC TGACGCAACA CGAGGTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 15120 

CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 15180 

AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 15240

CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA 15300

ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 15360 

AAACGACTCA AAGCTTAAAT ACTGCTATGA CAGGTTTAAA ACGTGGCGTT GCTAATCATA 15420 

ACCAAGtCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 15480 

ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 15540 

CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG 15600 
ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTATGCG GATGCAGATA 15840

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140

CAX.CGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGOGA 16260

AAGATATTTT AAATAAATCA AATGGTCAAA ATAAAACGAA AGATCAAGTT ACTGAAGCGA 16320

TGAATCAAGT GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GATCAAGCGA 16380

nCAAaCAGCA AAACAGCAGT TAAATAATAT GACGCATTTA ACAACTGCAC AAAAAACGAA 16440

TTTAACAAAC CAAATTAATA GTGGTACTAC TGTCGCTGGT GTTCAAACGG TTCAATCAAA 16500

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCGAGAAA TGAATCCAAG 16680

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800

AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860

TGAXACTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCATA CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400
TGtTAATAAT GCGAAACATG CATTAAATGG TACGCAAAAC TTAAACAATG CGAAACAAGC 17520

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGCGAC 17640

TGAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700

AGCAAGCAGT AAATATGTTA ATGCCGATAG GACTAAACAA AATGCTTACA CAACTAAAGT 17760

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820

AACAGCTGCA GCTAATCAAG TAAACAGCGC GAAACAAGAA TTAAATGGTG ACGAAAGATT 17880

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940

TCAAAAAGCT AAATTAAAAG AACAAGTGGG AGAAGCCAAT AGATTAGAAG ACGTACAAAC 18000

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATCCGAC 18180

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240

AAACGGTGCT GAAAACTTAA GAAATGCACA AAACACTGCT AAGGAAAACT TAAATACATT 18300

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGCAGGTCA 18360

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420

CTTGGAACAA GCTATCCATG ATCAAAACAC AGTTAAACAA AGTGTTAAAT TTACTGATGC 18480

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660

AAATGCATTA AATAACTTAA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780

GAAtACAGCG ATGGCTAACT TGCAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900

CGCAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140

ATCAAGTGCC AACACATTAA ATGGTGCTAT GGGTACGTTA AGAAATAGCA TACAAGATAA 19200
TAACAATGCT GTTGATAGTG CTAATGGTGT CATTAATGCA ACAAGCAATC CAAATATGGA 19320

TGCTAATGCA ATTAACCAAA TCGCTACACA AGTGACATCA ACGAAAAATG CATTAGATGG 19380

TACACATAAT TTAACGCAAG CGAAACAAAC AGCAACAAAT GCCATCGATG GTGCTACTAA 19440

CTTAAATAAA GCGCAAAAAG ATGCGTTAAA AGCACAAGTT ACAAGTGCGC AACGTGTTGC 19500

AAATGTAACA AGTATCCAAC AAACTGCAAA TGAACTTAAT ACAGCTATGG GTCAATTACA 19560

ACATGGTATT GATGATGAAA ATGCAACAAA ACAAACTCAA AAATATCGTG ACGCTGAACA 19620

AAGTAAGAAA ACTGCTTATG ATCAAGCTGT AGCTGCTGCG AAAGCAATTT TAAATAAACA 19680

AACAGGTTCA AATTCAGATA AAGCAGCAGT TGACCGTGCA TTACAACAAG TAACAAGTAC 19740

GAAAGATGCA TTGAATGGTG ATGCAAAACT GGCAGAAGCG AAAGCGGCAG CTAAACAAAA 19800

CTTAGGCACT TTAAACCATA TTACGAATGC ACAACGTACT GACTTAGAAG GCCAAATCAA 19860

TCAAGCGACG ACTGTTGATG GCGTTAATAC TGTAAAAACA AATGCCAATA CATTAGACGG 19920

CGCAATGAAT AGCTTACAAG GTTCAATCAA TGATAAAGAT GCGACATTAA GAAATCAAAA 19980

TTATCTTGAT GCGGATGAAT CAAAACGAAA TGCATATACG CAAGCTGTCA CAGCGGCTGA 20040

AGGCATTTTA AATAAACAAA CTGGTGGTAA CACATCTAAA GCAGACGTTG ATAATGCATT 20100

AAATGCAGTT ACAAGAGCGA AAGcGgCTTT AAATGGTGCT GACAACTTAA GAAATGCGAA 20160

AACTTCAGCA ACAAATACGA TTGATGGTTT ACCTAACTTA ACACAATTAC AAAAAGACAA 20220

CTTGAAGCAT CAAGTTGAaC AAGCGCAAAA TGTAGCAGGT GTAAATGGTG TTAAAGATAA 20280

AGGTAATACG TTAAATACTG CCATGGGTGC ATTACGTACA AGTATCCAAA ATGATAATAC 20340

GACGAAAACA AGTCAAAATT ATCTTGATGC ATCTGACAGC AACAAAAATA ATTACAATAC 20400

TGCTGTAAAT AATGCAAATG GTGTTATTAA TGCAACGAAC AATCCAAATA TGGATGCTAA 20460

TGCGATTAAT GGCATGGCAA ATCAAGTCAA TACAACAAAA GCAGCGTTAA ATGGTGCACA 20520

AAACTTAGCT CAAGCTAAAA CAAATGCGAC GAACACAATT AACAACGCAC ATGACTTAAA 20580

CCAAAAACAA AAAGATGCAT TAAAAACACA AGTTAACAAT GCACAACGTG TATcTGATGC 20640

AAATAACGTT CAACACACTG CAACTGAATT GAACAGTGCG ATGACAGCAC TTAAAGCAGC 20700

TATTGCTGAT AAAGAAAGAA CAAAAGCAAG CGGTAATTAT GTCAATGGTG ATCAAGAAAA 20760

ACGTCAAGCG TATGATTCAA AAGTGACTAA CGCTGAAAAT ATCATTAGTG GTAGACCGAA 20820

TGCGACATTA ACAGTCAATG ACGTAAATAG TGCGGCATCA CAAGTCAATG CGGCTAAAAC 20880

AGCATTAAAT GGTGATAACA ACTTACGTGT AGCGAAAGAG CATGCCAACA ATACAATTGA 20940

CGGCTTAGCA CAATTGAATA ATGCACAAAA AGCAAAATTA AAAGAACAAG TTCAAAGTGC 21000
GAAAGGCTTA AGAGATAGTA TTGCGAATGA AGCAACAATT AAAGCAGGTC AAAACTACAC 21120

TGACGCAAGT CCAAATAATC GTAACGAGTA CGACAGTGCA GTTACTGCAG CAAAAGCAAT 21180

CATTAATCAA ACATCGAACC CAACGATGGA ACGAAATACT ATTACGCAAG TAACATCACA 21240

AGTGACAACT AAAGAACAGG CATTAAATGG TGCGCGAAAC TTAGCTCAAG CTAAGACAAC 21300

TGCGAAAAAC AACTTGAATA ACTTAACATC AATTAACAAT GCACAAAAAG ATGCGTTAAC 21360

GCGTAgcATT GATGGTGCAA CAACAGTAGC TGGTGTAAAT CAAGAAACTG CAAAAGCAAC 21420

AGAATTAAAT AACGCAATGC ATAGTTTACA AAATGGTATC AATGATGAGA CACAAACAAA 21480

ACAAACTCAG AAATACCTAG ATGCAGAGCC AAGTAAGAAA TCAGCTTATG ATCAAGCAGT 21540

AAATGCAGCG AAAGCAATTT TAACAAAAGC TAGTGGTCAA AATGTAGACA AAGCAGCAGT 21600

TGAACAAGCA TTGCAAAATG TGAACAGTAC GAAGACGGCG TTGAACGGTG ATGGGAAATT 21660

AAATGAAGCT AAAGCAGCTG CGAAACAAAC GTTAGGTACA TTAACACACA TTAATAATGC 21720

ACAACGTACA GCGTTAGACA ATGAAATTAC ACAAGCAACA AATGTTGAAG GTGTTAATAC 21780

AGTTAAAGCC AAAGCGCAAC AATTAGATGG TGCTATGGGT CAATTAGAAA CATCAATTCG 21840

TGATAAAGAC ACGACGTTAC AAAGTCAAAA TTATCAAGAT GCTGATGATG CTAAACGAAC 21900

TGCTTATTCT CAAGCAGTAA ATGCAGCAGC AACTATTTTA AATAAAACAg CTGGCGGTAA 21960

TACACCTAAA GCAGATGTTG AAAGAGCAAT GCAAGCTGTT ACACAAGCAA ATACTGCATT 22020

AAACGGTATT CAmAACTTAG ATCGTGCGAA ACArGCTGCT AACACAGCGA TTACAAATGC 22080

TTCGGACTTA AATACAAAAC mAAAAGAAGC ATTAAAAgCA CAAGTAACAA GTGCAGGACG 22140

TGTATCTGCA GCAAATGGTG TTGAACATAC TGCGACTGAA TTAAATACTG CX5ATGACAGC 22200

TTTAAAGCGT GCCATTGCTG ATAAAGCTGA GACAAAAGCT AGTGGTAACT ATGTCAATGC 22260

TGATCTCGAAT AAACGTCAAG CATATGATGA AAAAGTTACA GCTGCCGAAA ATATCGTTAG 22320

TGGTACACCA ACACCAACGT TAACACCAGC AGATGTTACA AATGCAGCAA CGCAAGTAAC 22380

GAATGCTAAG ACGCAGTTAA ACGGTAATCA TAATTTAGAA GTAGCGAAAC AAAATGCTAA 22440

CACTGCAATT GATGGTTTAA CTTCTTTAAA TGGTCCGCAA AAAGCAAAAC TTAAAGAACA 22500

AGTGGGTCAA GCGACGACGT TGCCAAATGT TCAAACTGTT CGTGATAATG CACAAACATT 22560

AAACACTGCA ATGAAAGGTC TACGAGATAG CATTGCGAAT GAAGCAACGA TTAAAGCAGG 22620

TCAAAACTAC ACAGATGCAA GTCAAAACAA ACAAACTGAC TAGAACAGTG CAGTCACTGC 22680

AGCAAAAGCA ATCATTGGTC AAACAACTAG TCCATCAATG AATGCGCAAG AAATTAATCA 22740

AGCGAAAGAC CAAGTGACAG CTAAACAACA AGCGTTAAAC GGTCAAGAAA ACTTAAGAAC 22800
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AGATGCAGTG AAACGTCAAA TCGAAGGTGC AACGCATGTT AATGAAGTAA CACAAGCACA 22920 

AAATAATGCG GATGCaTTAA ATACAGCTAT GACGAACTTG AAAAATGGTA TTCAAGATCA 22980

GAATAGGATT AAGCAAGGTG TTAACTTCAC TGATGCCGAC GAAGCGAAAC GTAATGCATA 23040

TACAAATGCA GTGACGCAAG CTGAACAAAT TTTAAATAAA GCACAAGGTC CAAATACTTC 23100

AAAAGACGGT GTCGAAACTG CGTTAGAaAA TGTACAACGT GCTAAAAACG AATTGAACGG 23160

TAATCAAAAT GTTGCGAACG CTAAGACAAC TGCGAAAAAT GCATTGAATA ACCTAACATC 23220

AATTAATAAT GCACAAAAAG AAGCATTGAA ATCACAAATT GAAGGTGCGA CAACAGTTGC 23280

AGGTGTAAAT CAAGTGTCTA CAACGGCATC TGAATTAAAT ACAGCAATGA GCAACTTACA 23340

AAATGGTATT AATGATGAAG CAGCTACAAA AGCAGCGCTT AATGGTACTC AAAACCTTGA 23400

AAAAGCTAAA CAACACGCAA ATACAGCAAT TGACGGTTTA AGCCATTTAA CAAATGCACA 23460

AAAAGAGGCA TTAAAACAAT TGGTACAACA ATCGACTACT GTTGCAGAAG CACAAGGTAA 23520

TGAGCAAAAA GCAAACAATG TTGATGCAGC AATGGACAAA TTACGTCAAA GTATTGCAGA 23580

TAATGCGACA ACAAAACAAA ACCAAAATTA TACTGATGCA AGTCAGAATA AAAAGGATGC 23640

GTACAATAAT GCTGTCACAA CTGCACAAGG TATTATTGAT CAAACTACAA GTCCAACTTT 23700

AGATCCGACT GTTATCAATC AAGCTGCTGG ACAAGTAAGC ACAACTAAAA ATGCATTAAA 23760

TGGTAATGAA AACCTAGAGG CAGCGAAACA ACAAGCGTCA CAATCATTAG GTTCATTAGA 23820

TAACTTAAAT AATGCGCAAA AACAAACAGT TACTGATCAA ATTAATGGCG CGCATACTGT 23880

TGATGAAGCA AATCAAATTA AGCAAAATGC GCAAAACTTA AATACAGCGA TGGGTAACTT 23940

GAAACAAGCG ATAGcTGACA AAGATGCTAC GAAAGCGACA GTTAACTTCA CTGATGCAGA 24000

TCAAGCAAAA CAACAAGCAT ATAACaCTGC TGTTACAAAT GCTGAAAATA TCATTTCAAA 24060

AGCTAATGGC GGCAATGCAA CACAAGCTGA AGTTGAACAA GCAATCAAAC AAGTTAATGC 24120

TGCAAAACAA GCATTAAATG GTAATGCCAA CGTTCAACAT GCAAAAGACG AAGCAACAGC 24180

ATTAATTAAT AGCTCTAATG ACCTTAACCA AGCACAAAAA GACGCATTAA AACAACAAGT 24240

TCAAAATGCA ACTACTGTAG CTGGTGTAAA CAATGTTAAA CAAACAGCAC AAGAGTTAAA 24300

CAATGCTATG ACACAATTAA AACAAGGCAT TGCAGATAAA GAACAAACAA AAGCTGATGG 24360

TAACTTTGTC AATGCAGATC CTGATAAGCA AAATGCATAT AATCAAGCAG TAGCGAAAGC 24420

TGAAGCATTA ATTAGTGctA CGCCTGATGT TGTCGTTACA CCTAGCGAAA TTACTGCAGG 24480

GTTAAATAAA GTTACGCAAG CTAAAAATGA TTTAAATGGT AATACAAACT TAGCAACGGC 24540

GAAACAAAAT GTTCAACATG CTATTGATCA ATTGCCAAAC TTAAACCAAG CGCAACGTGA 24600 
AGCGGCGACA ACGCTTAATG ACGCGATGAC ACAATTGAAA CAAGGTATTG CGAATAAAGC 24720

ACAAATTAAA GGTAGCGAGA ACTATCACGA TGCTGATACT GACAAGCAAA CAGCATATGA 24780

TAATGCAGTA ACAAAAGCAG AAGAATTGTT AAAACAAACA ACAAATCCAA CAATGGATCC 24840

AAATACAATT CAACAAGCAT TAACTAAAGT GAATGACACA AATCAAGCAC TTAACGGTAA 24900

TCAAAAATTA GCTGATGCCA AACAAGATGC TAAGACAACA CTTGGTACAC TAGATCATTT 24960

AAATGATGCT CAAAAACAAG CGCTAACAAC TCAAGTTGAA CAAGCACCAG ATATTGCAAC 25020

AGTTAATAAT GTTAAGCAAA ATGCTCAAAA TCTGAATAAT GCTATGAOTA ACTTAAACAA 25080

TGCATTACAA GATAAAACTG AGACATTAAA TAGCATTAAG TTTACTGATG CAGATCAAGC 25140

TAAGAAAGAT GCTTATACTA ATGCGGTTTC ACATGCAGAA GGTATTTTAT CTAAAGCAAA 25200

TGGCAGCAAT GCAAGTCAAA CTGAAGTGGA ACAAGCGATG CAACGTGTGA ACGAAGCGAA 25260

ACAAGCATTG AATGGTAATG ACAATGTAGA ACGTGCAAAA GATGCAGCGA AACAAGTGAT 25320

TAGAAATGCA AATGATTTAA ATCAAGCAAT GACACAATTG AAACAAGGTA TTGCAGATAA 25380

AGACCAAACT AAAGCAAATG GTAACTTTGT CAATGCTGAt ACTGATAAGC AAAATGCTTA 25440

CAACAATGCG GTAGCACATG CTGAACAAAT AATTAGTGGT ACACCAAATG CAAACGTGGA 25500

TCCAGAACAA GTGGCTCAAG CGTTACAACA AGTGAATCaA GCTAAGGGTG ATTTAAACGG 25560

TAACCATAAC TTACAAGTTG CTAAAGACAA TGCAAATACA GCCATTGATC AGTTACCAAA 25620

CTTAAATCAA CCACAAAAAA CAGCATTAAA AGACCAAGTG TCGCATGCAG AACTTGTTAC 25680

AGGTGTTAAT GCTATTAAGC AAAATGCTGA TGCGTTAAAT AATGcAATGG GTACATTGAA 25740

ACAACAAATT CAAGCGAACA GTCAAGTACC ACAGTCAGTT GACTTTACAC AAGCGGATCA 25800

AGACAAACAA CAAGCATATA ACAATGCGGC TAACCAAGCG CAACAAATCG GAAATGGCAT 25860

ACCAACACCT GTATTGACGC CTGATACAGT AACACAAGCA GTGACAACTA TGAATCAAGC 25920

GAAAGATGCA TTAAACGGTG ATGAAAAATT AGCACAAGCG AAACAAGAAG CTTTAGCAAA 25980

TCTTGATACG TTACGCGATT TAAATCAACC ACAACGTGAT GCATTACGTA ACCAAATCAA 26040

TCAAGCACAA GCGTTAGCTA CAGTTGAAGA AACTAAACAA AATGCACAAA ATGTGAATAC 26100

aGCaATGAGT AACTTGAAAC aAGGTATTGC aAACAAAGAT ACTGTCAAAG CAAGTGAGAA 26160

CTATCATGAT GCTGATGCCG ATAAGCAAAC AGGATATACA AATGCAGTGT CTCAAGCGGA 26220

AGGTATTATC AATCAAACGA CAAATCCAAC GCTTAACCCA GATGAAATAA CACGTGCATT 26280

AACTCAAGTG ACTGATGCTA AAAATGGCTT AAACGGTGAA GCTAAATTGG CAACTGAAAA 26340

GCAAAATGCT AAAGATGCCG TAAGTGGGAT GAOGCATTTA AACGATGCTC AAAAACAAGC 26400
AGCAACGAGC CTAGATCAAG CAATGGATCA ATTATCACAA GCTATTAATG ATAAAGCTCA 26520

AACATTAGCG GACGGTAATT ACTTAAATGC AGATCCTGAC AAACAAAATG CGTATAAACA 26580

GGCAGTAGCA AAAGCTGAAG CATTATTGAA TAAACAAAGT GGTACTAATG AAGTACAAGC 26640

ACAAGTTGAA AGCATCACTA ATGAAGTGAA CGCAGCGAAA CAAGCATTAA ATGGTAATGA 26700

CAATTTGGCA AATGCAAAAC AACAAGCAAA ACAACAATTG GCGAACTTAA CACACTTAAA 26760

TGATGCACAA AAACAATCAT TTGAAAGTCA AATTACACAA GCGCCACTTG TTACAGATGT 26820

CACTACGATT AATCAAAAAG CACAAACGTT AGATCATGCG ATGGAATTAT TAAGAAATAG 26880

TGTTGCGGAT AATCAAACGA CATTAGCGTC TGAAGATTAT CATGATGCAA CTGCGCAAAG 26940

ACAAAATGAC TATAACCAAG CTGTAACAGC TGCTAATAAT ATAATTAATC AAACTACATC 27000

GCCTACGATG AATCCAGATG ATGTTAATGG TGCAACGACA CAAGTGAATA ATACGAAAGT 27060

TGCATTAGAT GGTGATGAAA ACCTTGCAGC AGCTAAACAA CAAGCAAACA ACAGACTTGA 27120

TCAATTAGAT CATTTGAATA ATGCGCAAAA GCAACAGTTA CAATCACAAA TTACGCAATC 27180

ATCTGATATT GCTGCAGTTA ATGGTCACAA ACAAACAGCA GAATCTTTAA ATACTGCGAT 27240

GGGTAACTTA ATTAATGCGA TTGCAGATCA TCAAGCCGTT GAACAACGTG GTAACTTCAT 27300

CAATGCTGAT ACTGATAAAC AAACTGCTTA TAATACAGCG GTAAATGAAG CAGCAGCAAT 27360

GATTAACAAA CAAACTGGTC AAAATGCGAA CCAAACAGAA GTAGAACAAG CTATTACTAA 27420

AGTTCAAACA ACACTTCAAG CGTTAAATGG AGACCATAAT TTAGAAGTTG CTAAAACAAA 27480

TGCGACGCAA GCAATTGATG CTTTAACAAG CTTAAATGAT CCTCAAAAAA CAGCATTAAA 27540

AGACCAAGTT ACAGCTGCAA CTTTAGTAAC TGCAGTTCAT CAAATTGAAC AAAATGCGAA 27600

TACGCTTAAC CAAGCAATGC ATGGTTTAAG ACAGAGCATT CAAGATAAGG GAGCAACTAA 27660

AGCAAATAGC AAATATATCA ACGAAGATCA ACCAGAGCAA CAAAACTATG ATCAAGCTGT 27720

TCAAGCCGCA AATAATATTA TCAATGAACA AACTGCAACA TTAGATAATA ATGCGATTAA 27780

TCAAGCAGCG ACAACTGTGA ATACAACGAA AGCAGCATTA CATGGTGATG TGAAGTTACA 27840

AAATGATAAA GATCATGCTA AGCAAACGGT TAGTCAATTA GCACATCTAA ACAATGCACA 27900

AAAACATATG GAAGATACGT TAATTGATAG TGAAACAACT AGAACAGCAG TTAAGCAAGA 27960

TTTGACTGAA GCACAAGCAT TAGATCAACT TATGGATGCA TTACAACAAA GTATTGCTGA 28020

CAAAGATGCA ACACGTGCGA GCAGTGCATA TGTCAATGCA GAACCGAATA AAAAACAATC 28080

CTATGATGAA GCAGTTCAAA ATGCTGAGTC TATCATTGCA GGATTAAATA ATCCAACTAT 28140

CAATAAAGGT AATGTATCAA GTGCGACTCA AGCAGTAATA TCATCTAAAA ATGCATTAGA 28200






TCAATTAACA CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 28320

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 28380

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAACGAGGA 28440

TCAAGCGCAA AAAGATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 28500

AACAACTGAT CCTACATTAG CTAAATCAAT CATTGATCAA GCGACACAGG CAGTGACAGA 28560 

TGCTAAAAAC AATTTACATG GTGATCAAAA ACTAGCTCAA GATAAGCAAC GTGCAACAGA 28620 

AACGTTAAAT AACTTGTCTA ACTTGAATAC ACCACAACGT CAAGCACTTG AAAATCAAAT 28680 

TAATAATGCA GCAACTCGTG GCGAAGTAGC ACAAAAATTA ACTGAAGCAC AAGCACTTAA 28740 

CCAAGCAATG GAAGCTTTAC GTAATAGCAT TCAAGATCAA CAGCAAACGG AAGCGGGTAG 28800

CAAGTTTATC AATGAAGATA AaCCaCmAAA AGrTGCTTAC CAAGCAGCAG TTCAAAATGC 28860 

AAAAGATTTA ATTAATCAAA CTAACAATCC AACGCTTGAT AAAGCACAAG TTGAACAATT 28920 

GACACAAGCT GTTAACCAAG CTAAAGATAA CCTACACGGT GATCAAAAAC TTGCAGACGA 28980 

TAAACAACAt GCGGTTACTG ATTTAAATCA ATTAAATGGT TTGAATAATC CGCAACGTCA 29040 

AGCACTTGAA AGCCAAATAA ACAACGCAGC AACTCGTGGC GAAGTAGCAC AAAAATTAGC 29100 

TGAAGCAAAA GCGCTTGATC AAGCAATGCA AGCATTACGT AATAGTATTC AAGATCAACA 29160

ACAAACAGAA TCTGGTAGCA AGTTTATCAA TGAAGATAAA CCGCAAAAAG ATGCTTACCA 29220 

AGCAGGAGTT CAAAATGCAA AAGATTTAAT TAACCAAACA GGTAATCCAA CACTCGACAA 29280 

ATCACAAGTA GAACAATTGA CACAAGCAGT AACAACTGCA AAAGATAATC TACATGGTGA 29340 

TCAAAAACTT GCTCGTGATC AACAACAAGC AGTAACAACT GTAAATGCAT TGCCAAACTT 29400

AAATCATGCA CAACAACAAG CATTAACTGA TGCTATAAAT GCAGCGCCTA CAAGAACAGA 29460 

GGTTSCACAA CATGTTCAAA CTGCTACTGA ACTTGATCAC GCGATGGAAA CATTGAAAAA 29520 

TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACCAAAT TACACTGAAG CGTCAACTGA 29580

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640 

TGGTTCAAAT GCGAATAAAG AOGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 29700 

AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 29760 

TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 29820 

AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTAGATCAA GCAACGCAAT TGAATCAATC 29880 

TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940 

CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000 
TGCAAAACAA GCATTAAATG GTGATGAACG TGTAGCACTT GCTAAAACAA ATGGTAAACA 30120

TGACATCGAC CAATTGAATG CATTAAACAA TGCTCAACAA GATGGATTTA AAGGTCGCAT 30180

CGATCAATCA AACGATTTAA ATCAAATCCA ACAAATTGTA GATGAGGCTA AGGCACXTAA 30240

TCGTGCAATG GATCAATTGT CACAAGAAAT CACTGACAAT GAAGGACGCA CGAAAGGTAG 30300

CACGAACtAT GTCAATGCAG ATACACAAGT CAAACAAGTA TATGATGAAA CGGTTGATAA 30360

AGCGAAACAA GCACTTGATA AATCGACTGG TCAAAACTTA ACTGCAAAAC AAGTTATCAA 30420

ATTAAATGAT GCAGTCACTG CAGCTAAGAA AGCATTAAAT GGTGAAGAAA GACTTAATAA 30480

TCGTAAAGCT GAAGCATTAC AAAGATTGGA TCAATTAACA CATCTAAACA ATGCTCAAAG 30540

ACAATTAGCA ATCCAACAAA TTAATAATGC TGAAACGCTA AATAAAGCAT CTCGAGCAAT 30600

TAATAGAGCA ACTAAATTAG ATAATGCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660

GCACCTTGGT GTTATCAGCA GCACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGCAAG GTAATGCAAT 30780

TGCaAAAGCT GAAGCAGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 30840

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900

TGGATTAAAT CAACAGCAAC AAGATCTTGC ACATAAAGCA ATTAACAATG CCGATACTGT 30960

ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT AGCAAAACGC 31080

TGACGATAAT GCTAAA 31096 
(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2243 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

ATGACAGAAT GGGAGCGAGG ACTTAGAATG TTTCCTAAAT CAGGTTTATT AAATTTTGAG 60

TTAGCGATAG mAAATCGTTC ATTAAATGAT GATGAAAAAG CATTAAAATA TGTGCGTAAA 120

GCATTAAATG CAGACCCTAA AAATACAGAT TATATTAACT TAGAAAAAGA GTTGACTAAA 180

TCAAATGAGT CGAAAAATAA ATAACTTTTA TGATGTACAA CAGTTATTGA AAAGTTACGG 240

ATTTCTAATA TATTTTAAAA ATCCAGAAGA TATGTACGAA ATGATTCAAC AGGAGATTTC 300 
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TAATCAGAGA AGGAATGAAC AGAAATGACA AAAATTATTT TAGCAGCTGA TGTAGGCGGG 420 

ACGACTTGTA AATTAGGTAT TTTCACACCT GAATTAGAAC AATTACATAA ATGGTCTATT 480 

CACACTGATA CATCTGATAG TACAGGATAT ACACTTTTGA AAGGAATTTA TGATTCGTTT 540 

GTTGAAAAAG TAAATGAAAA TAATTATAAT TTTTCAAATG TACTTGGCGT AGGTATTGGT 600 

GTACCAGGTC CTGTTGACTT TGAAAAAGGT ACAGTAAATG GAGCAGTAAA CTTATATTGG 660 

CCAGAAAAAG TTAATGTACG TGAGATTTTT GAACAATTCG TTGATTGTCC AGTGTATGTA 720 

GATAATGATG CTAACATAGC TGCTTTAGGG GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780 

GATGATGTTG TTGCCATCAC ACTTGGTACA GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840

TGAAATCGTA CATGGTCATA ATGGCTCtGG CGCAGAAATA GGTCATTTTA GAgCAGACTT 900

CgATCAACGA TTTaAATGTA ATTGTGGTCG TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960 

GACAGGCGTT GTTAACTTAG TTAACTTCtA CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020 

ATTAGAATTG ATTAAAGAAA ATAAGGTtAC aGCAAAAGCT GTTTTTGATG CGGCAAAAGC 1080 

TGGTGACCAA TTCTGTATTT TCATTACTGA AAAGGTTGCA AACTATATTG GATATTTATG 1140

TAGTATTATT AGTGTTACAA GTAATCCGAA ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200 

TGCAGGACCT ATTTTAATTG AAAATATTAA AACAGAATAT CATAATTTAA CATTTGCACC 1260 

TGCTCAATTT GAAACTGAAA TTGTACAAGC GAAATTAGGT AATGATGCAG GTATTACAGG 1320

AGCAGCAGGA TTAATCAAGA CCTATGTATT AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380

GTTGATGTGG TTGTTATTCC AGTTGGAACG GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440

GATATTCAGA AAAAACTTCA AGAATATAAA GCAATGGGTA AAATTGATTT TCAATTAACA 1500

CCAATGAATA CTCTAATTGA AGGTGAATTA AGCGATGTAT TAGAAGTTGT GCAAGTGATA 1560 

CATG^ATTAC CTTTTGATAA AGGTTTAAGT AGAGTTTGTA CAAATATCCG TATTGATGAC 1620

CGACGAGACA AATCTAGAAA AATGAATGAT AAACTAACAT CAGTACAAAA ACATTTAGAA 1680 

AATAGTGGTG AAAACCTATG AGGATTTCAA GCTTAACTTT AGGCTTAGTT GATACTAATA 1740 

CGTATTTCAT CGAAAATGAC AAAGCTGTTA TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800 

AAATTATTAA AAAATTAAAC CAAATAAATA AACCGTTAAA AGCTATTTTA TTAACACATG 1860

CACACTTTGA TCATATCGGA GCAGTCGATG ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920

ATATGCATGA AGCAGAGTTT GATTTTCTAA AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980

TTAAGCAATA TGGATTACCA ATTATTACAA GTAAGGTAAC TCCTGAAAAG TTAAmCGAAG 2040

GTAGCACAGA AATAGAAGGA TTTAAGTTnT nAyrTGTaCA CACACCTGGA CATTCACCAG 2100
GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 2220

ATAAAATATT TGAATTAGAA GGC 2243
(2) INFORMATION FOR SEQ ID NO: 61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8009 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:

TTGGnATCAT tyAcgGTAAA AAGAATAAaG CAAGATTtAT TTCATTAGTA CTAATTTGTG 60

CAATGTTTGC AATTTGTTGG GTTGCATATA TTCAATGGGA GTCTACAATC GCTTCATTTA 120

CACAATCTAT TAATATTTCa ATGGCACAAT ATAGTGTTTT ATGGACAATT AACGGAATAA 180

TGATTTTAGT AGCACAACCA TTAATTAAAC CGATTCTCTA TCTGTTAAAA GGAAACTTAA 240

AGAAGCAAAT GTTTGTCGGC ATCATCATTT TTATGTTGTC GTTCTTTGTC ACGAGTTTTG 300

CCGAAAACTT TACAATATTT GTTGTCGGTA TGATTATTTT AACTTTTGGA GAAATGTTTG 360

TATGGCCAGC AGTTCCAACT ATAGCCAATC AGTTAGCGCC AGATGGTAAG CAAGGACAGT 420

ACCAAGGTTT TGTGAATTCA GCTGCTACAG TAGGAAAAGC ATTTGGTCCA TTTCTTGGTG 480

GTGTATTAGT TGATGCGTTT AATATGCGCA TGATGTTTAT CGGTATGATG CTACTACTTG 540

TATTTGCATT AATATTATTA ATGGTTTTCA AGGAGAATAA TACGCAACCT AAAAAAATAG 600

ATGCATAATG AGTAAATAGA ATTAACGTTA TAGACTTGAA ATAAATGTCG TTATAACATA 660

ATATTAATTT GTATAATTTA ATTTCGTTTG GAGCTTTTCT ACAGAAAGCT AGTGATGCTG 720

AGAGCTAGTG TTAAGGACTA AATGTAAATC GTATTAATTT TAAATTGAAT GAATGACATC 780

TCTTACTATT AAAATGAGTG CACAATTTTT GTGAAATAGG GTGGTAACGC GGCAAATGTC 840

GTCCCTATGT AAATAGAATA GTTAGAGGTG TCTTTTTTAT TGAATAGGAG GAAATGTGTT 900

GAATTACAAC CACAATCAAA TTGAAAAGAA ATGGcAAGAC TATTGGGACG AAAATAAAAC 960

ATTTAAAACA AATGATAACT TAGGTCAAAA GAAATTTTAT GCTTTAGACA TGTTTCCATA 1020

TCCATCAGGT GCTGGTTTAC ATGTTGGACA TCCTGAGGGc TATACAGCAA CAGATATCAT 1080

TTCAAGATAT AAAAGAATGC AAGGATATAA TGTATTACAT CCGATGGGGT GGGATGCATT 1140

CGGATTACCA GCAGAGCAAT ATGCTTTAGA CACTGGCAAC GACCCACGTG AATTTACAAA 1200

GAAAAATATC CAAACTTTTA AACGACAAAT TAAAGAATTA GGGTTCAGTT ATGATTGGGA 1260
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GTTATATAAC AAAGGTTTAG CATACGTTGA TGAAGTTGCA GTTAACTGGT GTCCAGCATT 138 0 

AGGCACTGTT TTATCTAACG AAGAAGTGAT TGATGGTGTC TCTGAACGTG GTGGACATCC 1440

AGTTTATCGT AAGCCGATGA AACAATGGGT ACTTAAAATC ACAGAATATG CAGATCAATT 1500

ATTAGCAGAT TTAGATGATT TAGATTGGCC TGAGTCTTTA AAAGATATGC AGCGCAATTG 1560

GATTGGACGT TCTGAAGGGG CCAAAGTTTC ATTTGATGTA GATAATACGG AAGGAAAAGT 1620 

AGAAGTATTT ACGACTAGAC CAGATACAAT CTATGGTGCA TCATTCTTAG TCTTAAGTCC 1680

TGAACATGCA TTAGTTAATT CAATTACAAC AGATGAATAT AAAGAAAAAG TAAAAGCTTA 1740 

TCAAACAGAA GCTTCTAAAA AGTCAGATTT AGAACGTACA GATTTAGCAA AAGATAAATC 1800 

AGGTGTATTT ACTGGTGCAT ATGCAACTAA TCCTTTATCT GGTGAAAAAG TACAAATTTG 1860

GATTGCTGAT TATGTATTAT CAACATATGG TACTGGAGCA ATTATGGCAG TACCAGCGCA 1920

TGATGACAGA GATTATGAAT TTGCTAAAAA GTTTGATTTG CCAATCATTG AAGTCATCGA 1980

AGGTGGAAAT GTTGAAGAAG CAGCATACAC TGGTGAAGGT AAAGATATTA ATTCTGGTGA 2040 

ACTTGATGGT TTAGAAAATG AAGCGGCAAT TACTAAAGCT ATTCAATTAT TAGAGCAAAA 2100 

AGGTGCTGGC GAAAAGAAAG TTAATTACAA ATTAAGAGAT TGGTTATTCA GTCGTCAGCG 2160

TTATTGGGGC GAACCAATTC CTGTCATTCA TTGGGAAGAT GGAACAATGA CAACTGTTCC 2220 

TGAAGAAGAG CTACCATTGT TGTTACCTGA AACAGATGAA ATCAAGCCAT CAGGGACTGG 2280

TGAGTCTCCA CTAGCTAATA TTGATTCATT TGTAAATGTT GTAGATGAAA AAACAGGTAT 2340

GAAAGGACGT CGTGAAACAA ATACAATGCC ACAATGGGCA GGTAGTTGTT GGTATTATTT 2400

ACGTTAGATC GATCCTAAAA ATGAAAATAT GTTAGCAGAT CCTGAAAAAT TAAAACATTG 2460

GTTACCTGT7T GATTTATATA TCGGTGGAGT AGAACATGCG GTTCTTCACT TATTATATGC 2520

AAGATTTTGG CATAAAGTCC TTTATGATTT GGCTATCGTA CCTAGTAAAG AACCTTTCCA 2580

AAAATTATTT AACCAAGGTA TGATTTTAGG AGAAGGTAAT GAGAAGATGA GTAAATCTAA 2640 

AGGAAATGTA ATCAATCCTG ATGATATAGT ACAGTCTCAT GGTGCAGATA CTTTGCGTCT 2700

TTACGAAATG TTTATGGGAC CTTTAGATGC TGCAATTGCA TGGAGTGAAA AAGGATTAGA 2760

TGGGTCTCGT CGATTCTTAG ATCGCGTATG GGGTTTAATG GTAAATGAAG ATGGGACATT 2820

GAGTTCAAAA ATTGTAACTA CAAATAATAA ATCTTTAGAT AAAGTTTATA AGCAAACTGT 2880

TAAAAAGGTA ACAGAAGACT TTGAAACATT AGGATTTAAT ACTGCTATTA GTCAATTAAT 2940

GGTATTTATT AATGAGTGTT ATAAAGTTGA TGAAGTTTAT AAACCTTACA TTGAAGGCTT 3000

CGTTAAAATG TTAGCACCTA TTGCACCACA TATCGGTGAA GAATTATGGT CAAAATTAGG 3060 
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TGATGAAGTA GAAATCGTTG TTCAAGTGAA TGGTAAATTG AGAGCTAAAA TTAAAATTGC 3180 

TAAAGATACA TCAAAAGAAG AAATGCAAGA AATTGCCTTA TCTAATGACA ATGTTAAAGC 3240

GAGTATTGAA GGTAAAGACA TCATGAAAGT CATCGCTGTT CCTCAAAAAT TAGTCAATAT 3300

TGTAGCTAAA TAATGTTTTA AGGAGGACTT TGAAATGAAG TCAATTACTA CAGATGAATT 3360

AAAAAATAAA CTTTTAGAAT CTAAACCAGT TCAAATTGTT GATGTTCGTA CTGATGAAGA 3420

AACAGCAATG GGATATATTC CTAATGCAAA GTTAATTCCA ATGGATACCA TTCCGGATAA 3480

TTTAAATTGA TTTAATAAAA ATGAAATATA TTATATTGTA TGTGCTGGTG GAGTTCGAAG 3540

CGCTAAAGTT GTAGAATATT TAGAGGCAAA TGGCATTGAT GCCGTAAATG TCGAAGGCGG 3600

CATGCACGCA TGGGGCGATG AAGGTTTGGA AATAAAAAGT ATTTAAAGTA GTGACATAAT 3660 

TTAAAATAAT ATTACATTTG TAATGACACC AAGTAACGTT TCGGTTGCTT GGTGTTTTTT 3720 

GGTATGAATT ACTTTGTGTT ACAAAACAAT CTAAAGCGTT CTTGTTATGT TTTATTAAGA 3780 

TTTTAATTAC AAAACGGAAA CTAAATTGTA ATAAAATAAA ACTTTATTTT ATAAAATGAT 3840 

GATGATAAAA TTGAGTGAAC TTAAAATATT GTACAAAATA ATATAGCTAT AAATATAATA 3900

TAGCTATAAA TATAATATGA GGGAGCGTAT ATTTTTAGCA TAATTCTTAA CAACACAGCA 3960

GAGAACAGAC AACCAGGAGG AAAATGAAAT GAATTTGTTA AAGAAAAATA AATATAGTAT 4020

TAGGAAGTAT AAAGTAGGCA TATTCTCTAC TTTAATCGGAr ACAGTTTTAT TACTTTCAAA 4080

CCCAAATGGT GCACAAGCCT TAACTACGGA TAATAATGTA CAAAGCGATA CTAATCAAGC 4140

AACACCTGTA AATTCACAAG ATAAAGATGT TGCTAATAAT AGAGGTTTAG CAAATAGTGC 4200

GCAGAATACA CCTAATCAAT CTGCAACAAC GAATCAAGCA ACGAATCAAG CATTGGTTAA 4260

TCATAATAAT GGTAGTATAG TAAATCAAGC TACGCCAACA TCAGTGCAAT CAAGTACGCC 4320

TTCAGCACAA AACAATAATC ATACAGATGG CAATACAACA GCAACTGAGA CAGTGTCAAA 4380

CGCTAATAAT AATGATGTAG TGTCGAATAA TACCGCATTA AATGTAGCAA CTAAAACAAA 4440

TGAAAATGGT TCAGGAGGAC ATCTAACTTT AAAGGAAATT CAAGAAGATG TTCGTCATTC 4500

TTCAAATAAA CCAGAGCTAG TTGCAATTGC TGAACCAGCA TCTAATAGAC CGAAAAAGAG 4560

AAGTAGACGT GCGGCACCGG CAGATGCTAA TGCAACTCCA GCAGATCCAG CGGCTGCAGC 4620

GGTAGGAAAC GGTGGTGCAC CAGTTGCAAT TACAGCGCCA TATACGCCAA CAACTGATCC 4680

TAATGCCAAT AATGCAGGAC AAAATGCACC TAACGAAGTG CTGTCATTTG ATGACAATGG 4740

TATTAGACCA AGTACCAACC GTTCTGTGCC AACAGTAAAC GTTGTTAATA ACTTGCCGGG 4800

CTTCACACTA ATCAATGGTG GCAAAGTAGG GGTGTTTAGT CATGCAATGG TAAGAACGAG 4860
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TCGTATACAT GGAACTGATA CGAATGACCA TGGCGATTTT AATGGTATCG AGAAAGCATT 4 980 

AACAGTAAAT CCGAATTCTG AATTAATCTT TGAATTTAAT ACAATGACTA CTAAAAACGG 5040 

TCAAGGCGCA ACAAATGTTA TTATCAAAAA TGCTGATACT AATGATACGA TTGCTGAAAA 5100

GACTGTTGAA GGCGGTCCAA CTTTGCGTTT ATTTAAAGTA CCTGATAATG TGAGAAATCT 5160 

CAAAATTCAA TTTGTACCTA AAAATGACGC AATAACAGAT GCGCGTGGCA TTTATCAACT 5220 

AAAAGATGGT TACAAATACT ATAGCTTTGT TGACTCTATC GGACTTCATT CTGGGTCACA 5280

TGTTTTTGTT GAAAGACGAA CAATGGATCC AACAGGAACA AATAATAAAG AGTTTACTGT 5340 

AACAACATCA TTAAAGAATA ATGGTAATTC TGGTGCTTCT CTAGATACAA ATGACTTTGT 5400 

ATATCAAGTT CAATTACCTG AAGGTGTTGA ATATGTGAAC AATTCATTGA CTAAAGATTT 5460

TCCAAGTAAC AATTCAGGCG TTGATGTTAA TGATATGAAT GTTACATATG ATGCAGCAAA 5520 

TCGTGTGATA ACAATTAAAA GTACTGGAGG AGGTACAGCA AACTCTCCGG CACGACTTAT 5580 

GCCTGATAAA ATACTCGATT TAAGATATAA ATTACGTGTA AATAATGTGC CGACACCAAG 5640

AACAGTAACA TTTAACGAGA CATTAACGTA TAAAACATAT ACACAAGATT TCATTAATTC 5700 

AGCTGCAGAA AGTCATACTG TAAGTACAAA TCCATATACT ATCGATATCA TCATGAATAA 5760

AGATGCATTA CAAGCCGAAG TTGACAGACG TATTCAACAA GCTGATTATA CATTTGCGTC 5820 

ATTAGATATC TTTAATGGTC TGAAACGACG CGCACAAACG ATTTTAGATG AAAATCGTAA 5880 

CAATGTACCA TTAAATAAAA GAGTTTCTCA AGCATATATT GATTCATTAA CTAATCAAAT 5940

GCAACATACG TTAATTCGAA GTGTTGATGC TGAAAATGCA GTTAATAAAA AAGTTGACCA 6000 

AATGGAAGAT TTAGTTAATC AAAATGATGA ATTGACAGAT GAAGAAAAAC AAGCAGCAAT 6060 

ACAAGTTATC GAGGAACATA AAAATGAAAT AATTGGTAAT ATTGGTGACC AAACGACTGA 6120

TGATGGCGTT ACTAGAATCA AAGATCAAGG TATACAGACC TTAAGTGGGG ATACTGCAAC 6180

ACCGGTTGTT AAACCAAATG CTAAAAAAGC AATACGTGAT AAAGGAACGA AACAAAGGGA 6240

AATTATCAAT GCAACACCAG ATGCTACTGA AGACGAGATT CAAGATGCAC TAAATCAATT 6300

AGCTACGGAT GAAACAGATG CTATTGATAA TGTTACGAAT GGTACTACAA ATGCTGACGT 6360

TGAAACAGCT AAAAATAATG GCATCAATAC TATTGGAGCA GTTGTTCCTC AAGTAACTCA 6420 

TAAAAAAGCT GCAAGAGATG CAATTAACCA AGCAACAGCA ACGAAAAGAC AAGAAATAAA 6480

TAGTAATAGA GAAGCAACTC AGGAAGAGAA AAATGCAGCA TTGAACGAAT TAACTCAAGC 6540

AACCAACCAT GCTTTAGAAC AAATCAATCA AGCAACAACA AATGCTAATG TTGATAACGC 6600

CAAAGGAGAT GGTCTAAATG CCATTAATCC AATTGCTCCT GTAACTGTTG TTAAGCAAGC 6660
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TGATGCGACT CAAGAAGAAA GACAAGCAGC AATTGACAAA GTGAATGCTG CTGTAACTGC ' 6780 

AGCAAACACA AACATTTTAA ACGCTAATAC CAATGCTGAT GTTGAACAAG TAAAGACAAA 6840 

TGCGATTGAA GGAATAGAAG CAATTACACC AGCTACAAAA GTAAAAACAG ATGGAAAAAA 6900 

TGCCATCGAT AAAAGTGCGG AAACGCAACA TAATACGATA TTTAATAATA ATGATGCGAC 6960

GCTCGAAGAA CAACAAGCAG CACAACAATT ACTTGATCAA GCTGTAGCCA CAGCGAAGCA 7020 

AAATATTAAT GCAGCAGATA CGAATCAAGA AGTTGCACAA GCAAAAGATC AGGGCACACA 7080 

AAATATAGTA GTGATTCAAC CGGCAACACA AGTTAAAACG GATACTCGCA ATGTTGTAAA 7140 

TGATAAAGCG CGAGAGGCGA TAACAAATAT CAATGCTACA ACTGGCGCGA CTCGAGAAGA 7200 

GAAACAAGAA GCGATAAATC GTGTCAATAC ACTTAAAAAT AGAGCATTAA CTGATATTGG 7260 

TGTGACGTCT ACTACTGCGA TGGTCAATAG TATTAGAGAC GATGCAGTCA ATCAAATCGG 7320 

CGCAGTTCAA CCGCATGTAA CGAAGAAACA AACTGGTACA GGTGTATTAA ATGATTTAGC 7380 

AACTGCTAAA AAGGAAGAAA TTAATCAAAA CACAAATGCA ACAACTGAAG AAAAGCAAGT 7440 

GGCTTTAAAT CAAGTGGATC AAGAGTTAGC AACGGCAATT AATtnATATAA ATCAAGCTGA 7500 

TACAAATGCG GAAGTAGATC AAGCGCAACA ATTAGGTACA AAAGCAATTA ATGCGATTCA 7560 

GCCAAATATT GTTAAAAAAC CTGCAGCATT AGCACAAATC AATCAGCATT ATAATGCTAA 7620 

ATTAGCTGAA ATCAATGCTA CACCAGATGC AACGAATGAT GAGAAAAATG CTGCGATCAA 7680

TACTTTAAAT CAAGACAGAC AACAAGCTAT TGAAAGTATT AAAGAAGCTA ACACAAATGC 7740

AGAAGTAGAC CAAGCTGCGA CAGTAGCAGA GAATAATATC GATGCTGTTC AAGTTGATGT 7800

AGTAAAAAAA CAAGCAGCGC GAGATAAAAT CACTGCTGAA GTGGcGAacG TATTGaAGCG 7860

GTTAAACAAA CACCTAATGC AACTGACGAA GAAAAGCAGG CTGCTGTTAA TCAAATCCAA 7920

TCAACTTTAA AGATTCAAGC AATTTAATCC AAATTTAATC CAAAACCCAA ACAAATGGAT 7980 

TCAGGGTAGG ACACCACTTA CAAATCCAA 8009 
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10953 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 
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AGATGAATGC TAACCATATT CATTCTGCTA AAGATGGTCG TGTTACTGCG ACAGCTGAAA 180 

TTATTCATCG AGGTAAGTCG ACACATGTAT GGGATATAAA AATTAAGAAT GACAAAGAAC 240

AATTAATTAC AGTTATGCGT GGTACAGTTG CTATTAAACC TTTAAAATAA AAGAACTGCT 300

AGCTGAAATG TTATGAGATA TTCATAACTA CGGCTAGCAG TTTTTTTATG CGCTATATTG 360 

TTGTAGTTTT AGAAATGCTT GTTCAATGCG TTCGGCAGCT TTAGGGCCAC CCATAAGATT 420 

TCTACCAAAT GGTCCTAATT CTAAGTCTGC AAAGCATCCT GCGACAAATA GATTTGGTAT 480

CCATTCTAAT TTTTCGGAAA TAAGAGGGTA ATTACATTCG TTGATAGGTG CATCATAATT 540 

TTGTATTAAT TGCTTAATAA GTGGTTGTGA CATAAAATCT TGTTCAAAAC CAGTTGCAAC 600 

CATAATCTGT TGATATGGAA GAGAATCATT TTCAGTGTTA ATTACACCAC CACTAATTTG 660

AGTGATAGGT GTTTTATGCa CATTTATACG ACCATTTTTA ATATGTTTTT TAAGGCGTAA 720

GTACAGTTCG TGAGGCATTG ATCCTTTATG ACGTTCGCGT TGTACAATGG CATTTCTTTC 780

AGGCATGCTT TTAGTACTTA AAAATGAAGA CATATTTTTC GGACCTAACC AACCAGGATC 840 

AGCATCAAAG TCATGTATTT CAATATCTTT ATTTAGCCAT AAATGAATCT TTTTATCGTT 900 

ATCATGATTT AACAATTTAA GTGCAAGATG TGCAGCAGTa ATGCCGCTAC CAACGATATG 960

ATCGGTCTTA TCATATACTA CTTGATCAAG TTCTTTCTCG AAGATATGAT TTACATTCTG 1020

TTTGTCTTTT AAAATGTCAG GCATAAACGG AATATTTGTA CTGCCTATTG CAATAACGAC 1080

GCAATCTGTA GTGATAATTT GTCCATCTTC TAACTTGATA TGCCATTTGT CTTCTTGTTT 1140

ATCTAAAGTT TGAACTAAAC CTTGAACCAA GCAATCCTCT AATTGATATT GTTTAGAAGC 1200 

ATGTGCAATA TGATCCATAA ACATTGTCAA TTCAGGTCGT TGATAAGGAC CATAAAAAGC 1260 

ATTTGTATAT TGGTGCTGTT TAGCGAATTG TTTTAGATGG AACGGTTGTG GATGTACGTG 1320

ATGTACAATC GGTGATCTTA AATAAGGCAT TTCTATTCGA TTTGTATATG AGTTAAACCT 1380

TTGGCAAAAA GTTTCGTGTG GGTCAATGAT TGTTAATCGG TCTGTTGTTA ATCCGCTTGA 1440

TAATAGTTTT TGTGCGATTG CAGTTCCCTG TATGCCACCG CCGATAATTG TCCAATGCAT 1500

AATAAAACCT CTCTCTTTTT AAAACGTAAT AGTTACGATT TATAATTATT ATTATCATAA 1560

TACATAACGA CATGAAAGGC AATTAAATTA AAGAGATATA TGTAGATAGG GCGAATCTGT 1620

AGTCAAAGAA AAAATCATTG AAAAAGAGGT AACAATGTCA AAAGAwAACA GCAGTAAAAT 1680

CATTCCTAAT TTGGAATCAT CTTACTGCTG TTTGTTGTTG ATTTATATTC ATGATTTTGT 1740

TATATAATCT ACAATTTTGT GTCTTTTAAG TGTTCCGAAA TTTCATCGAC TTTAGTCTTT 1800

TTAGTATAAG GGGTTTTAAT ATTATATGCT GCTTTCATAA TCATATGACT TGAAAGAGGA 1860 
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GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 1980

CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTGCTAGG 2040

GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 2100

ATCATGTTCA ATCACCTTAC CTTTGTGCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 2160

AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 2220

GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGCA TCTAATGCGA CAACACGATC 2280

GGCAAGTGAT GGGCCTAGCA CAACGCGAAT GAGCATAGCT AACATAGAAA TGACAACTAT 2340

GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400

ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 2460

GCATGAATAT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 2520

AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 2580

ACAAAGAATC CTGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 2640

TTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700

ATGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760

ATGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820

GCGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCCTA ATACAGCTTT 2880

AACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940

ATCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCGTAGTTAA 3000

ATATTTGACG TCGACTTTGT TATTAAGATC ATATCGTTTT GGTTGACCGA AAAAGCCTTG 3060

TAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120

ACTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180

ACTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240

AGGATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA GCAGTGCCTG TAATTTTAAT 3300

CATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360

AATAGCCCCA ATCATACCTG ACTCTGTGAT CATTGCAACG CCGACTAAGA TCACACCTAC 3420

AGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGGAA GAGCACCGAC 3480

ACAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540

ATTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600

CAAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTATGCACT AGGTAACCAA 3660
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ATATTGACTA AGCCACTGTC ATGCGCTGAA AGGTTAGCTA ATTTATTGCT TATATCTGCT 37 80 

AGATTCAATG TTCCTACTAC TGAATATAAA ATCGCTACAC CCATTACGAA GAAGGATGAC 3840

GATACAACGT TAACAAGAAC ATATTTTATT GTTTCTTGTA GTTGAATTTT TGTAGAACCA 3900

ATTACTAATA AGAAATAAGA TGACATTAAA AATACTTCGA AAAATACGAA TAGGTTGAAA 3960

ATGTCACCAG TTGTGAATGC ACCAATGATA CCTATTAACA TAAATAGTAC TGAAAAATAA 4020

TAATAATATC TTTCACGTTC AATACCAATT GTTTGGTATG AATATAAAAT CACAATAGCT 4080

GTAATAATAA TACTAGTAAT TATTAGTAGG GCACTGAATA TGTCTAATAC AAAGACAATA 4140 

CTGTATGGTG CTTTCCATGA ACCTAGCTCT ACGCGTATTG GTCCATGTTT AACAACATTT 4200 

GCTAAATTGA TAATTGCCGC GACCAAGGTT AATAATGTAC CGCCTAGTGC GACATAACGC 4260 

TTTATAATAG GACGCTTTCC AATAAAGACA AGTAATATGG CTGTAATTAC TGGAATAACT 4320 

AGCGTTAACA CAAGCATATT ACTTTCAATC ATCTTCTGGA ACTCCTTTCA TACTCTCAAC 4380 

GTTATCTGTG CCTAATTCTT TATATGTTCT AAATGCTAAT ACTAAGAAAA AGGCTGTTGT 4440

CGCAAgGCGA TAACGATTGC TGTTAAAATA AGTGCTTGCG GGaTAGGaTC AACATAGCTT 4500

TTTACGTTCG CTTCATAAAT TGGAACAGTA CCATGTTTAA GTCCGCCCAT AGTTATTAAA 4560

AATAAATTTG CTGCATGTGT TAATAGTGTA GTTCCCATAA CAATTCGTAT CAGACTTTTA 4620

GACAAAACGA GATAGACACT AATTGCTGTG AGAATACCAC TAACAAAAAT CATAATAATT 4680

TCCACTATTC GTTCTCTCCA ATCGAAATAA TAATTGTCAT GACAGTACCA ACTACTGCAC 4740

ATAAAACACC GAAATCAAAG AATACTGCTG TTGTCATATG AACAGGTTGT AATATAAATA 4800

ACGGTATATC AAATGTGACA TGGGTAAAGA AATTTTTGCC TAAAAACCAA CTTGCGATAG 4860

GCGTCGCAAT AGAAAAAACT AATCCGATAC CTATCAAGAT TTTAAAATCT AATGGGAAAA 4920

TTTTACGCAT TGTTTCTATA TCAAATGCAA TCGTAATGAT AACAAGTGAA CTTGCGAATA 4980

ATAATCCGCC GACGAAACCG CCACCAGGTG TATAATGTCC TGCTAAGAAA AGTGAAAAAC 5040

CAAAGACCAT TACCATGAAA AAGATAATAA CTGCAGCAAA TTGCAAAATT AGATCATTTT 5100

GTTGTCTATT CATGATTTTT CACCTCGTTA CCTTGCGTTT GACGCTTTTT ACGTAATTTA 5160

ATCATTGTAT ATACAGCTAA TCCTGCGATA CCAAGCACAG ATGACTCGAA TAAAGTATCC 5220

ATACCACGGA AATCAACAAG TATGACGTTT ACCATGTTTT TACCGTGAGC tAAATCATAA 5280

ACGTGCTCTT GATAAAACTT AGATATCGAT TCAAAATGTC TATTTCCGTA TGCAATTAAA 5340

CCGATAATAA TGACGGACAA ACCAACACCA CCAGCAATTA AAGCATTAGT AAGCTGGAAT 5400

GAGCGCTTTT CATTATAACG ATTTAAATTT GGTAAGTGGT AGAAGCATAA TAAGAACAAT 5460
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ATAAACAATA CAGACACAGC ATATCCAACT GCACTTAACA TAATGATGCT AAATAATCTT 5580 

GATTTAGCGA AAAGAATTAA AAAGGCAGCA CTTAATAATA AAATTACGAT ACAAACTTCG 5640 

AAAATTCTAA TCGGACTAAC GTCTTTAAAA TTAATGTTGA AAGGTACTGA GAATATAGTG 5700 

ACAAATGTTA ATAAAATTAA TGCACCAAAA ATGATAACTA AATTATTACG TGAATAATCG 5760 

GTAACATAGC TATTCGTCAT CTTTTCAGAG TAGTTTGGAA TAACATTTGC ACTTCTGTTG 5820 

TACCAATAAT TGAATGTTAG TTTACCAGGT TGTCGTTGCA ACAATTTCAC CCAATAACTA 5880 

AATGTCACAA TTAGTAAGAT ACCTAAAATA TAAATCACTA ATGTTGATAA AAAGGCAGGC 5940 

GTTAATCCAT GGAACATATG GAATTCAACA TCATCAATTA CCGTATGATT AATCGAAGag 6000 

TnAGCTGGTT CAATAATCGA ATTAGTTAAA ATGCCAGGGA ATAAACCAAA TACAATTACT 6060 

AATGTAGCTA AAATAGCTGG TGATAAAAGC ATTAATATTG ATACTTCGTG TGCTTTTTTA 6120 

GGTAATTGTT CAGGTTTATA TTGTCCGAAA AATATATGCA TTATAAATTT AATTGAATAT 6180 

ACAAATGTGA AGACACTGCC CACTATACCA ATGATTGGGA ATAGGTAGCC TAATGTATCA 6240 

ACACTGAATA AATTTGCTTG GCTTGCTGTA AATGTTGTTT CTAAAAATGA TTCTTTTGAT 6300

AAGAAACCAT TGAACGGTGG TACACCAGCg CATACTTAAT GCTGTAATAA CAGTGATTGT 6360 

AAATGAAATA GGCATAATTG TTAGTAAGCC ACCTAATTTC TTAACATCAC GTGTACCAGT 6420 

AGAATGATCC ACTGCACCTG TAATCATAAA TAGGGCACCT TTAAATGTTG CATGGTTGAT 6480

TAAATGGAAT ATTGCAGCCG TAAATGCAGC AGCATATATT TTGCTATCAT CGCCTTGATA 6540

GTGATAACTA ATGGCACCGA TTCCAAGCAT CGCCATAATC ATACCTAATT GGGATACTGT 6600

TGAAAATGCC AGTATACCTT TCAAGTCTTG TTGTTTTGTT GCGTTTAGCG AAgCCCAGAA 6660

TAATGTAATT AAACCAACGA GTGTGACAGT CCATACCCAA CCTTGCGATG CTGCGAAGAT 6720 

TGGTGTCATT CGAGCGATTA AATATAACCC TGCTTTAACC ATTGTTGCTG AATGAAGATA 6780 

AGCACTGACT GGTGTAGGTG CTTCCATTGC ATCTGGTAGC CAAATATAAA ATGGAAACTG 6840

AGCAGATTTT GTAAAAGCAC CAATCATGAT TAAAATCATC GCAAAAATGA AGAATGGGCT 6900 

ATTTTGAATT TCAGAAGCAT GTTGAATCAT GTACTGAATG CTAAATGATT GTGTTGGTAT 6960 

AGCGAGTAAG ATGATACCAC CTAATAATGA TAGACCACCA AATACTGTGA TTATGAGCGA 7020 

TTTTTGAGCA CCATATATAG ATGCTTGTCG TTCGCGCCAG AATGAAATAA GTAAAAAACT 7080 

AGAAAATGAC GTTAGCTCCC AGAATAAATA TAGAATAATA ACATTATCTG AAAGTACGAC 7140 

ACCTAACATT GCACCCATAA ATAGTAATAA ATAACAATAA AAATTCCCTA GTTGTTCTGA 7200

CTTACTTAAG TAGCCGATTG AATATAATAC TACTAAACTG CCGATTCCTG AAATAAGCAA 7260 
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CCAATTTAAG GTTTTCATrA CAGTATTACC TGACATCGTC GTTTTAATTA ATGTAAGCAT 7380

ATAAATAAAT ATGACGATAG GGACAGGTAA TACGAACCAT CCTAAATGTA TACGTTTAAA 7440

AAATCTATAC AGGATAGGAA TAATGAGTGC GAATATTAAC GGTAATATCA CCGCAATATG 7500

TAACAAACTC ACTATGTTGT CCTCCTTTAA AAAATATTTA TGTTATTCAT TATACATGAA 7560

TGATATAGTT CTGAAAAACG TACACACTCC TTGTTGTGCT TTATTTTCAG AaGTATTTAA 7620

ATAAGAAGAA ACACGTCATT TTTTATTTAA AATTTTCTTT GTATTGAAGT GAATAATCTT 7680

CTTTTAAGCG TGCTAAACTA GCTAAAGACA TTTCAGCATG TTTTGTTTGC TGAGCTTTAA 7740

GTTTAGTTTC TAAATCTGTA ATTGCTTGTT GAAGTGAATC TTCATAGCGC AATACATCAA 7800

CATTGAAGTC GCGTAATTGT GAACGTTTCG TATAGCGTTT TTCAAAATGG GTTAATGCTT 7860

TGCGGTCATG GAAAAATACA CCTTCAGTTT CAGTAGGGTT ATGTAAATCA CCTTGTTTCG 7920

GGTGTTTGAT AACTTGTTCA ACTTTAACAA GGACATCGTC TCCATTTTCT TCAACAATCG 7980

TGACACCATA GCTACCTGTT TTGTGTGAAA ATCGATATAG CTTCATGCTA TTTTCCTCCC 8040

TTAAAAGTAT GTTAATATAT ATGTATCATA ACATGAATGG AGAATATAAA TGGCTAACTA 8100

TCCACAGTTA AACAAAGAAG TACAACAAGG TGAAATCAAA GTGGTTATGC ACACAAATAA 8160

AGGTGACATG ACATTCAAAT TATTTCCAAA TATTGCACCA AAAACAGTTG AAAATTTTGT 8220

GACACATGCA AAAAATGGTT ATTATGATGG AATCACATTC CACCGTGTCA TTAATGACTT 8280

CATGATTCAA GGTGGCGATC CAACAGCTAC TGGTATGGGT GGCGAAAGTA TTTATGGCGG 8340

TGCTTTTGAA GATGAATTTT CATTAAATGC ATTTAACTTA TATGGCGCAT TATCAATGGC 8400

TAACTGAGGA CCTAATACTA ATGGTTCACA ATTTTTCATT GTTCAAATGA AAGAAGTACC 8460

TCAAAATATG TTAAGTCAAC TTGCAGATGG TGGCTGGCCT CAACCAATCG TTGATGCATA 8520

TGGCSAAAAG GGTGGTACAC CATGGTTAGA TCAAAAACAT ACAGTATTCG GTCAAATCAT 8580

TGATGGTGAA aCTACATTAG AAGATATTGC AAATACAAAA GTGGGACCAC AAGATAAACC 8640

ACTTCATGAT GTTGTAATTG AATCTATTGA TGTTGAAGAA TAATATCTAA ACATAATTAA 8700

CTACCAACAT TTTAAACTCG GATAAAGCTA ATTTATGAAT GGATTAGTAT ATATTCCAAC 8760

gAAAATAAAT AAACTAATAT GATGAGCAAT CTCAATATAT TTATCaAGAA AGCACAGTTT 8820

TTAAATAGAT GTGTATTTTA AAGATAATAG TTGAGGTTGC TTTTTATGTT TTTACAGAGA 8880

ATTGCTATTC AAATAGTAAA TAAATTGAAA ACAAAGTAGC TGGATATCAT ATTpATTTAG 8940

ATAGGAATTT GTTGCTAATT TTATTTGTAA ATCCAAGTTT GTAGAATTCT TATTCATTTA 9000

TAAAATAATA TTCGTATGAT TTGATTTTTT AATTAGTCCA CCATTTCGAT TTGTGCTATG 9060
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AACATATCAA GGTGCGTGTA CTGGTATTCA ACCATACGGT GCGTTTGTTG AGACCCCTAA 9180

TCATACTGAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 9240

GAAATTTCTA TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 9300

AAAGCTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 9360

AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 9420

AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATTCGAAACG ACTAAAGGAA 9480

CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTCT TAGACTATAA 9540

AAGAGATTAG TATCTATTAA ATTTTATTAG ATACTAATCT CTTTTTGTCT AGGATAACGT 9600

AATATGaTTG ATTCTATTTA CACGTACAAA TGGTTTAAGG TGACATATCC ATTATCTTTG 9660

TTAGATAGAA TCGTTGATTT GCaATATTGT ATGTGGATTT GTTTTTTTTA TTTATTTTAG 9720

AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAC TATATAAACA GATAATTGGA 9780

GAATGAAAAA ATTACATGTT ATAGTCAACT CAATAATTTT AAGGAGGAAT TAAGTAATGA 9840

AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTACC AAATGGAGTA GAGTTGAGAA 9900

ATCGATTTGT GTTAGCCCCT TTAACACATA TTTCTTCAAA TGATGATGGT ACTATTTCAG 9960

ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 10020

CGAGTAATGT GAGTGATGTC GGAAAAGCAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080

GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 10140

TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200

TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260

GAGAAATGAC GAATGAAGAG ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACGCGAC 10320

GTGCAATTGA AGCAGGGTTT GATGGTGTTG AAATACATGG CGCGAATCAT TACTTAATTC 10380

ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440

TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10500

TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10560

TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 10620

TGATGGATAC GCATGCAACG ACACGTGAAG GTAAATACGC TGGACAAGAA AGACTGCCTT 10680

TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10740

CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800

AGCTACTACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10860






EP0 786 519 A2 



AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 10953
(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8155 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

TTTGATAnAA AACTGAATnA ATTAAATGTA TCGATTCAAC CTAATGAAGT GAATTTACAA 60 

GTTAAAGTAG AGCCTTTTAG CAnAAAGGTT AAAGTAAATG TTAAACAGAA AGGTAGTTTA 120

GCAGATGATA AAGAGTTAAG TTCGATTGAT TTAGAAGATA AAGAAATTGA AATCTTCGGT 180

AGTCGAGATG ACTTACAAAA TATAAGCGAA GTTGATGCAG AAGTAGATTT AGATGGTATT 240

TCAGAATCAA CTGAAAAGAC TGTAAAAATC AATTTwCCAG AACATGTCAC TAAAGCACAA 300 

CCAAGTGAAA CGmAGGCTTA TATAAATGTA AAATAAATAG CTAAATTAAA GGAGAGTAAA 360 

CAATGGGAAA ATATTTTGGT ACAGACGGAg TAAGAGGTGT CGCAAACCAA GAACTAACAC 420

CTGAATTGGC ATTTAAATTA GGAAGATACG GTGGCTATGT TCTAGCaCAT AATAAAGGTG 480 

AAAAACACCC ACGTGTACTT GTAGGTCGCG ATACTAGAGT TTCAGGTGAA ATGTTAGAAT 540

CAGCATTAAT AGCTGGTTTG ATTTCAATTG GTGCAGAAGT GATGCGATTA GGTATTATTT 600

CAACACCAGG TGTTGCATAT TTAACACGCG ATATGGGTGC AGAGTTAGGT GTAATGATTT 660 

CAGCCTCTCA TAATCCAGTT GCAGATAATG GTATTAAATT CTTTGGATCA GATGGTTTTA 720 

AACTATCAGA TGAACAAGAA AATGAAATTG AAGCATTATT GGATCAAGAA AACCCAGAAT 780

TACCAAGACC AGTTGGCAAT GATATTGTAC ATTATTCAGA TTACTTTGAA GGGGCACAAA 840

AATATTTGAG CTATTTAAAA TCAACAGTAG ATGTTAACTT TGAAGGTTTG AAAATTGCTT 900

TAGATGGTGC AAATGGTTCA ACATCATCAC TAGCGCCATT CTTATTTGGT GACTTAGAAG 960

CAGATACTGA AACAATTGGA TGTAGTCCTG ATGGATATAA TATCAATGAG AAATGTGGCT 1020

CTACACATCC TGAAAAATTA GCTGAAAAAG TAGTTGAAAC TGAAAGTGAT TTTGGGTTAG 1080

CATTTGACGG CGATGGAGAC AGAATCATAG CAGTAGATGA GAATGGTCAA ATCGTTGACG 1140

GTGACCAAAT TATGTTTATT ATTGGTCAAG AAATGCATAA AAATCAAGAA TTGAATAATG 1200

ACATGATTGT TTCTACTGTT ATGAGTAATT TAGGTTTTTA CAAAGCGCTT GAACAAGAAG 1260

GAATTAAATC TAATAAAACT AAAGTTGGCG ACAGATATGT AGTAGAAGAA ATGCGTCGCG 1320 
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CTGGTGATGG TTTATTAACT GGTATTCAAT TAGCTTCTGT AATAAAAATG ACTGGTAAAT 1440 

CACTAAGTGA ATTAGCTGGA CAAATGAAAA AATATCCACA ATCATTAATT AACGTACGCG 1500 

TAACAGATAA ATATCGTGTT GAAGAAAATG TTGACGTTAA AGAAGTTATG ACTAAAGTAG 1560

AAGTAGAAAT GAATGGAGAA GGTCGAATTT TAGTAAGACC TTCTGGAACA aACCATTAGT 1620 

TCGTGTCATG GTTGAAGCAG CAACTGATGA AGATGCTGAA aGATTTGCAC AACAAATAGC 1680 

TGATGTGGTT CAAGATAAAA TGGGATTAGA TAAATAAATA CTGTATTACA AATGAGCCGA 1740

TGCGTATGCA nTcgtTTTTT GTGTTTGTAG AAATAATTTA TAGTACAAAC GTAAAATGAT 1800

ATAAACAAAA TAAAAACAAA GTAATCAATA TGTAATATAA AATACACTGG TACTCAATAT 1860 

ATAATGATGA TAAAATTAAT TTTAATTAGA TAGAGTTGCT TTGTGTTTTT AACGCAGATG 1920

CTACTACTTA TCTTAACAGT TGATTAAGTG AAATCATTTA ACAGCGAGAA TAATCAACCA 1980

GGAGGATGAC TTAATGAATT TATTCAGACA ACAAAAATTT AGTATCAGAA AATTTAATGT 2040

CGGTATTTTT TCAGCTTTAA TTGCCACTGT TACTTTTATA TCTACTAACC CGACAACAGC 2100

GTCTGCAGCA GAGCAAAATC AGCCTGCACA AAATCAACCA GCACAACCAG CTGATGCCAA 2160 

TACACAGCCT AACGCAAATG CTGGTGCTCA AGCTAATCCT ACAGCACAGC CAGCTGCACC 2220

TGCCAACCAA GGACAACCAG CAGTACAACC AGCAAACCAA GGTGGACAGG CTAATCCAGC 2280

AGGAGGAGCA GCACAACCAA ATACACAACC AGCTGGACAA GGTGATCAAG CTGATCCGAA 2340

TAACGCTGCA CAAGCACAAC CTGGAAATCA AGCAACACCG GCAAACCAAG CAGGTCAAGG 2400

AAATAACCAA GCAACACCTA ATAATAATGC AACACCGGCA AATCAAACAC AGCCAGCGAA 2460

TGCTCCAGCA GCAGCGCAAC CAGCAGCACC TGTAGCAGCA AACGCACAAA CTCAAGATCC 2520

AAATGCTAGC AATACTGGTG AAGGCAGTAT TAATACX3ACA TTAACATTTG ATGATCCTGC 2580

CATATCAACA GATGAGAATA GACAGGATCC AACTGTAACT GTTACAGATA AAGTAAATGG 2640

TTATTCATTA ATTAACAACG GTAAGATTGG TTTCGTTAAC TCAGAATTAA GACGAAGCGA 2700

TATGTTTGAT AAGAATAACC CTCAAAACTA TCAAGCTAAA GGAAACGTGG CTGCATTAGG 2760

TCGTGTGAAT GCAAATGATT CTACAGATCA TGGTAACTTT AACGGTATTT CAAAAACTGT 2820

AAATGTAAAA CCAGATTCAG AATTAATTAT TAACTTTACT ACTATGCAAA CGAATAGTAA 2880

GCAAGGTGCA ACAAATTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 2940

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATCGTTT 3000

AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGCTGAT GCTTCAAGAA TTACAACAAA 3060

TAAAGATGGT TATAAATACT ATTCATTCAT TGATAATGTA GGTCTATTCT CAGGATCACA 3120 
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15 



TAATACTGAA ATCGGTAACA ATGGTAATTT TGGTGCTTCA TTAAAAGCAG ATCAATTTAA 3240

ATATGAAGTA ACATTACCAC AAGGTGTAAC TTACGTTAAT AATTCATTAA CTACAACATT 3300

CCCTAATGGT AATGAAGACA GTACAGTATT GAAAAATATG ACTGTTAATT ATGATCAAAA 3360

TGCAAATAAA GTTACATTTA CAAGCCAAGG TGTGACAACG GCACGTGGTA CACACACTAA 3420

AGAAGTTTTA TTCCCAGATA AATCTTTAAA ATTATCATAT AAAGTTAATG TTGCGAATAT 3480

CGATACACCT AAAAATATTG ATTTTAATGA AAAATTAACA TATCGTACTG CTTCAGATGT 3540

TGTAATTAAT AATGCGCAAC CAGAAGTaCA CTAACTGCAG ATCCATTTTC AGTAGCGGTT 3600

GAAATGAACA AAGATGCGTT GCAACAACAA GTAAACTCAC AAGTTGATAA TAGTCATTAC 3660

ACAACAGCAT CAATTGCAGA ATACAATAAA CTTAAACAAC AAGCAGATAC TATTTTAAAT 3720

GAAGATGCGA ATCATGTTAA AACTGCAAAT CGTGCATCTC AAGCGGATAT TGATC3GTTTA 3780

GTAACTAAAT TACAAGCTGC ATTAATTGAT AATCAAGCAG CAATTGCTGA ATTAGATACT 3840

AAAGCTCAAG AAAAGGTTAC AGCAGCACAA CAAAGTAAAA AAGTTACGCA AGATGAAGTT 3900

GCAGCACTTG TAACTAAAAT TAACAATGAT AAAAATAATG CAATCGCAGA AATTAATAAA 3960

CAAACTACAG CACAAGGTGT CACAACTGAA AAAGATAATG GTATCGCAGT GTTAGAACAA 4020

GATGTGATTA CACCAACAGT TAAACCTCAA GCGAAACAAG ATATTATCCA AGCAGTTACA 4080

ACTCGTAAAC AACAAATTAA AAAGTCAAAT GCATCATTAC AAGATGAAAA AGATGTAGCA 4140

AATGATAAAA TTGGTAAAAT TGAAACAAAG GCAATTAAAG ATATTGATGC AGCAACAACA 4200

AATGCACAAG TAGAAGCCAT TAAAACAAAA GCAATCAATG ATATTAATCA AACTACACCT 4260

GCTACAACAG CTAAAGCAGC AGCTCTTGAA GAATTTGACG AAGTTGTTCA AGCACAAATT 4320

GATCAAGCAC CTTTAAATCC TGATACAACA AATGAAGAAG TAGCGGAAgC TATTGAACGT 4380

ATTAATGCAG CTAAAGTTTC TGGTGTTAAA GCAATTGAAG CGACAACGAC TGCACAAGAT 4440

TTAGAAAGAG TTAAAAACGA AGAAATCTCA AAAATTGAAA ATATTACTGA CTCTACGCAA 4500

ACAAAAATGG ATGCCTATAA TGAAGTTAAA CAAGCTGCAA CAGCTAGAAA AGCTCAAAAT 4560

GCTACAGTTT CAAATGCAAC AAATGAAGAA GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4620

GCTCAAAAGC AAGGTTTACA TGACATCCAA GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4680

ACAAAATCAA AAGTATTAGA TAAAATCAAT GCAATTCAAA CACAAGCAAA AGTTAAACCT 4740

GCAGCTGATA CGGAAGTAGA AAACGCATAT AATACACGTA AACAAGAAAT TCAAAATAGC 4800

AATGCTTCAA CTACAGAAGA AAAACAAGCT GCAfATACAG AATTAGATAC TAAAAAGCAA 4860

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT ACAAACAGTG ATGTAACAAC AGCTAAAGAC 4920
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GCGGAAATCG CTCAAAAAGC AAGTGAACGT AAAACAGCAA TTGAAGCAAT GAATGATTCG 5040 
ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100

GCTGATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 5160

GCTACAATCG CAGCCATTAC ACCTGATGCA AATGTTAAAC CAGCAGCAAA ACAAGCAATT 5220

GCAGATAAAG TACAAGCTCA AGAAACAGCA ATTGATGGAA ATAACGGCTC AACAACTGAA 5280

GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 5340

GATGCAGCAC ATACAAATGC GGAAGTTGAA GCGGCTAAAA AAGCAGCAAT TGCTAAAATT 5400 

GAAGCGATTC AGCCAGCAAC AACAACTAAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 5460 

GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGGAAGACA TTACTGCTGA AGAAATTGCA 5520

GCGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5580 

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACGACAGGTG AAAATAGTAT TGATCAAGTA 5640 

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCAAT TTTAAATAAC 5700

AAATTGCAAG AGATTCAAGC tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5760

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AACTACTAAG 5820

GCACAAGTTG ATGAAGGTAA AGCAAATGCA GAAGCAGGGA TTAATGCGGT AACACCAAAA 5880

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAGCAAC GCAAACAAAT 5940

GTTATCAATA ATGATGAGAA CGCTAGAACA GAAGAAAAAG AAGGAGCTAT TCAAGAATTA 6000

GCAACAGCAG TTACAGACGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6060

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120

AAATCAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCA AGCAATTGAT 6180

AATAGAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6240 

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 6300 

AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATACAACAAT TAAAGATGTT 6360

GCGAAAGATG AATTAGCAAC AAAAGCAAAC GAACAAAAAG CGCTTATTGC ACAAACTGCA 6420

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACGCACA ATTAACACAA 6480

GGTAATCAAA ATATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6540

GCAATTCAAG CAATTGACCC AATTCAAGCA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6600

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 6660

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6720 
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AAAGTTCAAC 


AACTTCATGC AAATCCTGTT AAGAAACCAG GAGGTAAAAA AGAATTAGAT 6840
CAAGCTGCAG CTGATAAGAA AACACAAATA GAACAAACAC CAAATGCATC ACAACAAGAA 6900
ATTAATGATG GAAAACAAGA AGTTGATACT GAATTAAATC AAGCGAAAAC AAATGTCGAT 6960
CAATCATCAA CAAATGAATA TGTTGATAAT GCAGTTAAAG AAGGAAAAGC TAAAATTAAT 7020
GCAGTTAAAA CATTTAGTGA GTACAAAAAA GATGCTTTAG CTAAAATTGA AGATGCATAT 7080
AATGCTAAAG TAAACGAAGC GGATAACTCT AACGCATCGA CTTCAAGTGA AATTGGTGAA 7140
GCGAAACAAA AACTTGCTGA ATTAAAACAA ACTGCGGATC AAAATGTTAA TCAAGCTACT 7200
TCTAAAGATG ACATTGAAGT TCAAATTCAT AATGACTTAG ATAATATTAA CGATTACACA 7260
ATTCCAACAG GTAAAAAAGA ATCAGCTACA ACAGATTTAT ATGCTTATGC AGATCAGAAG 7320
AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATtAAG 7380
CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440
GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500
ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560
ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620
CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680
AAAGCGAAAg cTCaAGGACT TGAAGCATTT GATAACATTC AAATCGACTC AACAGAAAAA 7740
CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800
AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860
TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920
AATAGTGCGG TTGAACAACT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980
TTGG/LAGCAA TCAGAGAAGT GGTTAACAAG caaataggaa tAATTAAAAA TGCAGATGCA 8040
GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100
GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGAAG TTGCCTGAAT TACCA 8155




(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
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CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT . 120 

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 180 

AGTTATCGAA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 240

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 300

CCGTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA TGATGTTTAT 360

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 420

CACCTCCAAC GCCAAGTATG ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 480

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 540

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600 

CCAAAATTCA ACGAGGAAAC ATTTATTAAA CGCTATAAAA CCCAACTAAT TCTTTATTAA 660 

AAACTTAAAG AAACGCATAA AAATACGCAA GACAAAGTCT TGCGTATCGA TAGAGTCCGT 720 

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTGTTAT ATACAGGTGG GTGCCCTGTT 780 

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 840

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 960

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 1020

GAAACTTCAC TAGGCGCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080

ATTGTATCTC TCAAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 1140

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 1200

TCCTgTGCTT GTTCTTTAGT AAATCCAAGA ACTTCGAACA TTTTTTCTTG TAACTCACCA 1260

TCATGAATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 1320

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 1380

GTAAATGGAT GATGTGCTGC AACGTAACGT TTCGCATCTT CATCATATTC TAATAATGGC 1440

CAATCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA TTCTTTAGCT 1500 

AATTTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 1560 

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 1620 

CAAAGAAACG 1630 

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 732 base pairs
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<C) STRANDEDNESS : double 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
CAATTGGACA TCTTGTATGA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 
CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 
ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 
CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 
CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 
GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 
TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 
CATGTGCATA TTtATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 
CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 
TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT
ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 
CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 
TACCACAACC CG

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5838 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
AATATATTCA TATGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 
AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 
GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 
TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 
AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 
CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 
ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTAAAAT 
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CAACTTTATA 


CATTAAAATA ATATCATAAT AAGGATAAAA AATAATAGAT ATTGATTTTA 540
GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 600
GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660
ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720
TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 780
TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 840
ATAACGATAT TGAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAATCGAAAT 900
TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 960
GTTTATCACT TAGTGATGAT GATAAAAAAA CGTCTAAAAA TATGGATAAA TTAAACTCTG 1020
ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 1080
TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 1140
CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200
ATGAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260
CTGGTATTAT TTATCTGATT TATTATTTCT TCATCTTAAC TGAAGACCAA CGCAAATATC 1320
GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 1380
CACTTGTAGG TAAAATAGTC TACGTGCTTC CAl-m-lTAT TCTAAAAACT ACTTTCTAAA 1440
CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTGAGCTTC 1500
TGtTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 1560
TTCTTCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 1620
TTCAACGAAT GCCTCTTTCA TTTTTAATTT TAATCTTTCA TTTTTATAAA TrAACATATC 1680
AAACAGTTCA TCAATATCAA TATCTTGTAA AATCGAACCG TGTTGGAGGA TTACGCCCTT 1740
TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 1800
TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 1860
TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATCCTTCTA ATAATCGTTG 1920
TGAAATCACT CTGTACGCTT CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 1980
AATCACACTG TAAGTTAACT CTTTATCATG TAGCACCCCA CGGCCACCAG TTTGACGCCT 2040
TACGAGACCA AAACCTTTCT CTTTAACCTT ATCAATATCA ATTTCTTTTT GTAGCCTTTG 2100
GAAATACCCT ATTGATAATG TTGCAGGATT CCATGTGTAA AAACGTATAA CTGGATCAAT 2160
TTCACCTCTA GAGACAAAAT TTAATAACGC TTCATCCATT GCCATATTAT AATATGGGTC 2220
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AAATGTATAA T ATTTG ATT C GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 2340 

TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 2400 

TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 2460 

TATTGGTTAT ATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 2520

AAATGAATTC CATAATGGGA TTAGAAAAGC TCAAGTCATC GATGTTAGAG AGAAAGTTGA 2580

CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAATGT TCAGGCAACG 2640

ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 2700

CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 2760

CGGCTATAAA AAATGGACTG GAAAAATAAA GTCTAAAAAA TAGTTTTTGT AAATTTAATA 2820 

TACGATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 2880 

TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 2940

CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAATG 3000 

AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 3060 

TTTCTTGTTT AAATAGAAAT TGTCTTTTTC AATTGATTTT GAAACCATTA TCCTTAAATC 3120 

TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 3180 

TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 3240 

ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 3300 

ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 3360

TCAATCATCA TACCTTCTTC AACATTTAAT GGGAAGTATA TTGTTGGTGG ATGTACACCG 3420

AAATCTAATA ATCGCTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 3480

CCACTTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCAAA GTGTTTAGAT 3540

AAACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CAGAAACCTC TTTAAGTCCA 3600

GTTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 3660

AATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 3720

TCTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC ACCGACTGGA 3780

CCTGAACCAG GACCGCCACC ACCATGTGGA CCAGTAAATG TTTTATGCAA GTTTAAATGA 3840

ACAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 3900

CCATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 3960

TTTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 4020
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GATTTAAATC CTGCAAATGa AGCTGAGGCT GGaTTCGTAC CATGCGCAGA ATCTGGcACA 414 0 

ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 4200

GCAGTGCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 4260 

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4320 

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGGTATTC TAGCAACCTT TTCATTAATT 4380 

TTAGGGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4440 

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4500 

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4560 

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4620 

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 4680 

TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTCGT CTGAAACAAC 4740 

ACCTAAATCA AAACCACCGA TAATATTGTA CTTCACTAAT TCCTCGTTAA CTTGTTGAAT 4800 
TGGTTTGTCA AATTTGACTA CAAACTCATT GmnAAGnTGT ACCATCTAAT ACTTCAAAAC 4860

CTTTTTTAAT AAATTGTTGT TTAGCATAGT TAGCATGTTC TATATTTTGA ACTGCAATAT 4920 

CATAGATACC TTGTTTACCA AGTGCTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4980 

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 5040

GTAATGTTAA TACAAAGCCA CGATTACGTT CATCATCTTG TGTTTGACCG ACTAATCTAC 5100 

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 5160 
ATTGAGCAGG AATTCCGAAT GGCTGAGTAT CACCTACAAC AATATCTGCA CCAAATGAAC 5220

CTGGAGGTGT AAGTAATCCC AATGCTAATG GATTTGCATA TACGATAAAT AATGCTTTTT 5280

TATCFTCAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 5340

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 5400

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TTAGCATAAG 5460

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 5520

TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGTCATCCCA TCATACATAG 5580

AAGAATTTGC TACATCCATA TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5640

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATGGCGT ATATGCTGTG TAAAATTCTG 5700 

ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCATAAACA CCAGCACCCA 5760

rAAATGATGT ATGCGTTTCT TTAGTGATAT tCTTGCTkGC AATGGGGATT TAAACnTCTA 5820 
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(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

ATnATAATTG GCTTTGCTAA TAATTACTTC CCTGAATTAC aAGTATTAGC AAACGAAATA 60 

AAATCTGATA TGGCTAGTTC ATTAAAACAA TGATATTTTT ATTTAAATTT TTaAAGCTTT 120

GTACGAAATT GTACAAAGCT TTTTTGGTGC GTATTGTATG GGCAACAACT TGACGATGAA 180 

AATCCGTTAC AGGATTGGTA ATAGGAAATG TTAGCGAAAG ACAAGGGTAT CCATTGTAGA 240 

TTAACAAAAG GACGTTTCCA CAAGTGTGGG TTATTCTCAC TAAAGCAATA CGCAGAGACA 300

ACTTACGTAA AATTTTGAAC TGACTAGAAC GGAACTTCTA CTCAATTATT GATAAAAATT 360 

TTCAAAAAGA CTTGAATGTG CTGAGAATAC GAAGTTTATG GAAGGATTAT CAAAATATAA 420 

ATGTGCATTC ATTTACAACC TTTATTGACA ATGATTCTCA ACTAATATAG TATATAATCA 480

AATCGTAATA GTTACGATTT GTTTTCTGCA ACTTTTTTGA AGTTTTAGTT GAGGTGAAAA 540 

CAATAAAAGC ATCTAAGTGA ATGTAGTTAA CGGACAACTG CATTCGCTTG TAGAGCCACA 600 

AGAAGCSACT TTAAATAAGG TTTACGGTTG CATTTTGATA CAACAACCGA TTACTAAGTC 660 

ATGCTTTCCA CTTTGCGGGT TAGCATGACT TACCTAATAG ATAGAGCTAT TAGGTTCAGC 720 

TTCTAAAAAA TTACAGTTTT AGAGGAATAC AGTTGcTTGc tTCGCAACAA CTGCATAAGA 780 

GCCATGGTTT TCGCTTTTGC GAATTAGCAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900

AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTCTAAAAAT ACATCTAAAG GAGTGTCGTA 960

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020 

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080 

CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260

GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320 

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380 
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TTCAACATTT GGTTGAAGCG TTTGTGCGTG AgcAACAATG GAGTCACAAA TATAAAACAG 1500 

TAGGTATGAT GCTTTTTGAT GAACAACGTC AATTTTTACA GCCATTAATC TATATACCAG 1560 

AAATTCAAAG TTTAATTTGG GAAAATAGCT GTGGTTCTGG TACAgcATCA ATTGGGGTTT 1620 

TTAATAATTA TCAACGTAAT GACGCATGCA AAGATTTTAC AGTACATCAG CCAGGGGGCA 1680 

GTATTTTAGT GACATCAAAG CGATGTCATC AATTGGGATA TCAAACTTCA ATTAAAGGAC 1740

AGGTTACAAC TGTAGCTACA GGaAAAGCAT ATATAGAATA AGGAGCCTAC AATGAATAAC 1800 

TTTAATAATG AAATCAAATT GATATTACAA CAATATTTAG AAAAGTTTGA AGCGCATTAC 1860 

GAGCGTGTAT TACAAGACGA TCAATATATC GAAGCATTAG AAACATTGAT GGATGACTAT 1920 

AGTGAATTTA TTTTAAATCC TATTTATGAA CAACAATTTA ATGCTTGGCG TGACGTTGAA 1980 

GAAAAAGCAC AATTaATAAA ATCACTGCAA TATATTACAG CGCAGTGTGT TAAACAAGTG 2040 

GAAGTCATTA GAGCGAGACG TCTATTAGAC GGACAGGCGT CTACCACAGG TTACTTTGAC 2100 

AATATAGAAC ATTGTATTGA TGAAGAGTTT GGACAATGTA GTATAGCTAG CAATGACAAA 2160

TTATTGTTAG TTGGTTCAGG TGCATATCCA ATGACGTTAA TTCAAGTAGC AAAAGAAACA 2220 

GGTGCTTCAG TTATCGGTAT TGATATTGAT CCACAAGCCG TTGACCTAGG GCGCAGAATC 2280

GTTAACGTCT TAGCACCAAA TGAAGATATA ACAATTACGG ATCAAAAGGT ATCTGAACTT 2340

AAAGATATCA AAGATGTGAC GCATATCATA TTCAGCTCGA CAATTCCTTT AAAGTACAGC 2400

ATTTTAGAAG AATTATATGA TTTAACAAAT GAAAATGTCG TAGTTGCAAT GCGCTTTGGT 2460

GATGGCATCA AAGCAATATT TAATTATCCG TCACAAGAAA CAGCGGAAGA TAAGTGGCAA 2520

TGTGTGAATA AACATATGAG ACCACAGCAA ATTTTTGATA TAGCACTTTA TAAAAAAGCA 2580

GCTATAAAGG TAGGTATTAC GGATGTCTAA ATTATTAATG ATAGGCACTG GTCCgGTCGC 2640

AATGCAATTA GCGAATATTT GCTATTTAAA ATCAGATTAT GAGATTGATA TGGTTGGACG 2700

TGCCTCAACA TCAGAAAAAT CAAAACGCTT ATATCAAGCG TATAAAAAAG AGAAACAATT 2760 

TGAAGTCAAA ATACAAAACG AGGCGCATCA ACATCTGGAA GGTAAGTTTG AAATTAATCG 2820 

TTTGTATAAA GATGTTAAAA ACGTTAAGGG TGAATACGAA ACGGTTGTCA TGGCATGCAC 2880

AGCAGATGCT TATTATGACA CACTACAGCA ATTGTCGTTA GAAACTTTGC AAAGTGTCAA 2940

ACATGTCATT TTAATATCAC CGACATTTGG TTCGCAAATG ATTGTCGAAC AATTTATGTC 3000

TAAATTTAAT AAAGATATCG AAGTGATTTC ATTCTCAACT TATCTTGGCG ATACACGTAT 3060

TGTTGATAAA GAAGCGCCTA ATCATGTGTT GACAACAGGT GTAAAAAAGA AATTGTACAT 3120

GGGATCGACA CATTCAAACT CAACAATGTG TCAACGAATC TCTGCTTTAG CTGAGCAATT 3180
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TTATGTGCAC CCACCACTAT TTATGAATGA CTTTTCATTG AAAGCCATTT TCGAAGGAAC 3300 

AGATGTACCG GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGCCAGACG AAAACGGACA 3600 

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660 

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAAGTCGCT ATGAAGCAAG 3780 

TTGCCAGGCG TACAAGGATA TGCATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840

TCTATTTGAA GGAGATAAAG CACTCGTCAC AAAATTTTTG GAAATCAATA GAACGCTTTC 3900

ATAATAAGGG TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3960

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 4020 

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4080 

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 4140 

AAAACAAGCA ATTAACGTAT ACGACGGTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4200 

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 4260 

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGctGACGCA 4380

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 4440 

TCGACATTAA TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4500 

GAAGCATATC AACCTGCATT GGCTGAATTA GCGATGCCTC GTCCATATGT ATTTGTGTCT 4560 

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4620

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4680 

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AAGTAATGCG TGCTGGTGAA 4740 

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTGACAGA TGATAGAGGT 4800

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4860

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4920

GCTGTGAGTG ACAAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980 
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ACAGACATTA ATTTCGATAT GCCAACACGT AAGTATGACC TTAAAAAAGC AGAATCATTA 5100 

TTAGATGAAG CTGGTTGGAA GAAAGGTAAA GACAGCGATG TTCGTCAAAA AGATGGTAAA 5160

AACCTTGAAA TGGCAATGTA CTATGACAAA GGTTCTTCAA GTCAAAAAGA ACAAGCAGAA 5220

TACTTACAAG CAGAATTTAA GAAAATGGGT ATTAAGTTAA ACATCAATGG CGAAACATCA 5280

GATAAAATTG CTGAACGTCG TACTTCTGGT GATTATGACT TAATGTTCAA CCAAACTTGG 5340

GGATTATTGT ACGATCCACA AAGTACTATT GCAGCATTTA AAGAGAAAAA TGGTTATGAA 5400

AGTGCAACAT CAGGCATTGA GAACAAAGAT AAAATATACA ACAGCATTGA TGACGCATTT 5460

AAAATCCAAA ACGGTAAAGA GCGTTCAGAC GCTTATAAAA ACATTTTGAA ACAAATTGAT 5520

GATGAAGGTA TCTTTATCCC TATTTCACAC GGTAGTATGA CAGTTGTTGC ACCaAAAGAT 5580 

TTAGAAAAAG TATCATTCAC ACAATCACAG TATGAATTAC CATTCAATGA AATGCAGTAT 5640 

AAATAAAGGA GCAATTAGAT GTTCAAATTT ATCTTAAAAC GTATTGCGCT CATGTTTCCA 5700

TTGATGATTG TAGTAAGTTT TATGACATTT CTATTGACGT ATATTACAAA TGAAAATCCA 5760 

GCTGTGACAA TTTTACATGC ACAAGGGACG CCAAATGTAA CACCAGAGTT GATTGCAGAA 5820 

ACGAATGAGA AGTACGGTTT CAATGATCCA TTATTAATTC AATATAAAAA TTGGTTACTT 5880

GAAGCGATGC AATTTAATTT TGGTACAAGC TACATTACAG GGTGAC CCAGT TGCTGAACGT 5940

ATTGGTCCAG CATTTATGAA TACATTGAAA TTAACAATAA TTTCAAGTGT TATGGTGATG 6000

ATTACATCAA TTATTTTAGG TGTAGTTAGT GCATTAAAAA GAGGAAAGTT CACTGATCGT 6060

GCGATACGTT CAGTGGCTTT CTTTCTAACT GCATTACCAT CATATTGGAT AGCTTCAATA 6120

CTTATTATTT ACGTTTCAGT GAAGTTAAAC ATATTGCCGA CTTCTGGATT AACAGGTCCA 6180

GAAAGTTAGA TATTGCCAGT GATCGTTATT ACGATTGCCT ATGCTGGTAT TTACTTTAGA 6240

AATOTTAGAC GCTCGATGGT GGAACAATTA AATGAAGATT ATGTACTTTA TTTAAGAGCA 6300

AGCGGTGTGA AATCTATCAC ATTAATGTTG CATGTGTTGC GTAATGCTTT ACAAGTTGCG 6360 

GTATCAATCT TTTGTATGTC TATACGAATG ATAATGGGTG GACTAGTTGT TATGGAGTAT 6420

ATCTTTGCAT GGCCTGGACT AGGTCAATTA AGTTTAAAAG CAATACTTGA ACACGATTTT 6480

CCAGTCATTC AAGCATATGT ATTAATTGTA GCGGTATTAT TTATTGTATT TAATACATTA 6540

GCAGATATCA TTAATGCGCT ATTAAATCCA AGATTAAGGG aGGGCGCACG ATGATAATTT 6600

TAAAmCGATT ATTmCArGwT AAAGGTGCAG TAATTGCTTT AGGCATTATT GTATTATATG 6660

TCTTTTTAGG ATTAGCAGCA CCACTTGTGA CATTTTATGA TCCTAACCAT ATCGATACAG 6720

CAAACAAATT TGCTGGCATG AGTTTTCAAC ATCTACTAGG TACTGACCAT TTAGGTAGAG 6780 

EP 0 786 519 A2 




TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAAGGGT 6900

TTGTTGACGC CTTAATCATG CGTGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 6960 

TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATATTATC ATGGCATTTA 7020 

TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG 7080 

CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 7140

AAGATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT 7200 

CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 7260 

CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA 7320

TGTTTGCGCC AGGTATTGCC ATAGTGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 7380 

CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCTGTGAAAA 7440 

AAGGAGTGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG 7500 

GACAGATCAA CCACTCGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 7560 

CGTTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 7620 

TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT 7680

ATCTGAATCG CAATTGAAAA AGTACCGTGG TAAAGACATT GCGATGGTCA TGCAACAAGG 7740

TAGTCGTGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 7800

ACATAGGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 7860

AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACGCT TACATGTTAT CAGGAGGAAT 7920

GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CCAAAGTTAA TCATTGCTGA 7980

TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 8040

TATTJ^AAAAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA 8100

CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGtCAG GTTATTGAAC AAGGGACACG 8160

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 8220

GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 8280 

ATGTTGAAAA GTCATATCAA AGCGCACATG TTTTTAAGCG TCGTCGAACA CCTATCGTGA 8340 

AAGGTGTGTC ATTTGAGTGT CCAATCGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 8400 

GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG 8460 

TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG 8520

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 8580 
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TGTTGGAAGA AGTCGGTCTA TCTAAGGCAT ACATGGATAA ATATCCTAAT ATGTTATCAG 8700 

GTGGAGAGGC GCAACGTGTT GCGATTGCGC GTGCAATATG TATTAACCCT AAATATATTT 8760 

TGTTTGATGA AGCCATTAGT TCACTCGACA TGTCAATTCA AACACAAATA TTAGATTTAT 8820 

TGATTCATTT ACGTGAAACG CGTCAGTTGA GTTATATTTT TATCACACAT GATATTGAAG 8880 

CTGCCACGTA TTTATGTGAT CAATTAATTA TTTTTAAAAA CGGAAAAATA GAAGAACAAA 8940

TTCCGACAAG CGCATTGCAT AAAAGTGACA ATGCTTATAC AAGAGAATTA ATAGAAAAAC 9000 

AACTATCATT CTAAGGAGTG AGATAATGAA AGGTGCAATG GCTTGGCCCT TTTTGAGATT 9060 

ATATATATTA ACATTGATGT TCTTTAGTGC CAATGCAATG TTAAACGTGT TTATACCTTT 9120

ACGAGGGCAT GATTTAGGCG CAACGAATAC GGTTATCGGT ATCGTTATGG GGGCATACAT 9180 

GTTAACAGCA ATGGTATTTC GACCATGGGC AGGACAAATT ATTGCTCGTG TCGGTCGCAT 9240 

TAAAGTATTA AGAATTATTT TGATTATCAA TGCCATAGCT TTAATTATTT ATGGTTTTAC 9300 

TGGCTTAGAA GGTTATTTCG TAGCACGTGT TATGCAAGGT GTGTGTACGG CATTCTTTTC 9360 

TATGTCTTTA CAGCTAGGTA TTATTGATGC ATTACCAGAG GAACATCGTT CTGAAGGTGT 9420

ATCATTGTAC TCGCTATTTT CAACGATTCC AAACTTAATC GGACCATTAG TTGCCGTAGG 9480

TATTTGGAAT GCAAATAATA TTTCACTATT TGCAATTGTC ATTATCTTTA TCGCATTAAC 9540

AACAACATTC TTTGsTATGG CGTGACCTTT GCTGAAGAGG AACCCGATAC GTCAGATAAG 9600

ATTGAAAAAA TGCCGTTTAA CGCTGTAACT GTTTTTGCGC AATTTTTCAA AAATAAAGAG 9660 

TTGTTGAACA GTGGTATTAT CATGATTGTT GCATCGATTG TATTTGGTGC AGTTAGTACA 9720 

TTTGTACCGT TATACACAGT GAGTTTAGGA TTCGGGAATG CGGGAATCTT TTTGACAATA 9780

CAGGCCATCG CAGTTGTTGC GGCAAGATTT TACTTAAGGA AATACATTCC GTGAGATGGT 9840

ATGTGGCATC CTAAATATAT GGTATCTGTA CTATCATTAT TAGTAATCGC GTCATTTGTA 9900

GTGGCATTTG GTCCGCAAGT AGGTGCAATT ATTTTCTATG GTAGTGCGAT ATTAATAGGA 9960

ATGACGCAAG CAATGGTGTA CCCAACATTA ACATCATACT TAAGCTTCGT CTTACCAAAA 10020 

GTAGGTCGTA ATATGTTGTT AGGTTTATTT ATTGCCTGTG CAGACTTAGG TATATCGTTA 10080 

GGTGGCGCAT TGATGGGACC TATTTCCGAT TTAGTAGGAT TTAAATGGAT GTATCTAATT 10140 

TGTGGTATGT TAGTCATTGT AATAATGATT ATGAGTTTCT TGAAAAAGCC AACACCACGT 10200 

CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260

TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320 

TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380 



TTACATGAAA ATATGCAAAA CGAGTATAAC TGCTAATTGA TAGAAATAGC TCACCATAAA 10500
ATTACGGTAT GATTTTAAAT ATAAGTAAGT CGCACTACCT GCTAGTATCA ATGCTGGAAT 10560
GAATTCCCAC CATGTATTAA TGTATGGATA GTAGAACAGA GTTTCAAGGA TAATGGACAA 10620
TACTATTGTA ATCTTTAAAG GTATTAATCT GCTTAATTCT TGAATTAAAA TATGAGGGAA 10680
AATAAGTTGA CAAATCAAAG TATTTAATAT AATGGTTAAC GAAAATATAG CTATTAAACT 10740
GATGGAaCCA TACCCTTTAA TGAGCGGGTA AATGTCAAAG ACAGTAAAGG AATCTACATT 10800
TAGTGCGAAA ATATTGAAAT GATTTAAAAG TAAAAAGAGT ACGACACTTA GTGTAAATGA 10860
TATAAGAATA TGCCATTTAT ATTTAGCACT AGCAACGATT TGCGAACGTA TCATTGGAAT 10920
AAACGCATCT TGATGCATCA GACGAAAAAT AGCTAGTGAA ATAATAACTG CGAGTAAATA 10980
GCTAATGTTC ATTGAAATAG GAAAAGAGAA AGCCCACGGA GCTTGTTGAG TGAATACAGC 11040
TACTAACCCA AAAGTTAAAA AGACGATAAT GATCGGCAAG ATGTTAACCA AAAATATGTA 11100
AAGGAAAATA AATCCAATAT CACGTTTGAA AAAACGCGAf TGTTCGGTAG CGTATTCTTC 11160
TTCTATGTAA TGTTTATTTG TATTTGAGAT AGTATACCTC TTAAATAGTT GTATTATATA 11220
GATACTTTAG CACATATTAC TTTGTATTGT ATGTTTTATA CATTAAAATT TAAAATGAAA 11280
AACATATCAT AAAATTGTTT TATAAAATGA AGCGCTTCCA TTGTGTTTTG TTTTGTAAGG 11340
TGTATCATAA ATATTGAATT GAAATTTTGG GGGGAGGTAT TGTAATGACG TTTCTTACAG 11400
TCATGCAATT TATAGTTAAG ATTATCGTTG TAGGATTCAT GCTTACGGTT ATTGTTATCG 11460
GGCTTATTTG GTTAATTAAA GATAAAAGAC AATCACAACA TAGTGTATTA AGGAATTATC 11520
GTTTACTAGC ACGTATTAGA TATATTTCAG AAAAAATGGG ACCGGAATTA CGTCAGTATT 11580
TATTTTCTGG GGATAATGAA GGGAAACCTT TTTCACGTAA TGATTATAAA AATATCGTTT 11640
TGGCfGGAAA ATATAACTCT CGTATGACCA GCTTCGGTAC TACTAAAGAT TATCAAGACG 11700
GCTTTTACAT AGAGAAGACA ATGTTTCCGA TGCAACGTAA TGAGATTTCA GTAGATAATA 11760
CAACATTGTT ATCAACATTC ATTTATAAAA TCGCGAATGA GCGTTTATTT AGTCGTGAAG 11820
AATATCGTGT GCCGACAAAG ATTGATCCGT ATTACTTAAG TGATGACCAT GCAATAAAAT 11880
TAGGTGAACA TTTAAAACAT CCATTTATTT TAAAACGTAT CGTAGGACAA 11940
GTTATGGCGC TTTAGGAAAA AATGCCATTA CAGCTTTATC TAAAGGTCTA GCTAAAGCGG 12000
GCACTTGGAT GAATACAGGT GAAGGTGGCT TATCAGAATA TCATTTAAAA GGTAATGGGG 12060
ATATCATTTT CCAAATTGGT CCCGGTTTAT TTGGTGTTCG TGATAAAGAA GGTAATTTTA 12120
GTGAAGGTTT ATTTAAAGAG GTTGCACAGT TATCTAACGT ACGCGCATTT GAGCTGAAGT 12180
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TTGCTAAAAT 


CCGAAATGTT GAACCTTATA AAACAATCAA TTCACCTAAC CGTTACGAAT 12300
TTATTCATAA TGCTGAAGAT TTGATTCGTT TCGTCGATCA GTTGCAGCAA TTAGGTCAAA 12360
AACCAGTAGG ATTCAAAATT GTAGTAAGCA AAGTTTCAGA AATTGAAACA CTTGTACGTA 12420
CGATGGTGGA ACTAGATAAG TATCCAAGCT TTATTACGAT TGATGGTGGT GAAGGTGGTA 12480
CTGGTGCAAC ATTCCAAGAA TTACAAGATG GTGTTGGCTT ACCGCTATTT ACAGCTCTAC 12540
CTATTGTGTC TGGCATGTTA GAAAAATATG GTATTCGAGA TAAAGTGAAA TTGGCGGCAT 12600
CTGGTAAGTT AGTGACACCA GATAAAATTG CGATTGCACT AGGTTTAGGT GCAGATTTTG 12660
TAAATATCGC ACGTGGGATG ATGATTAGTG TCGGTTGTAT AATGAGTCAA CAATGTCACA 12720
TGAATACGTG TCCTGTAGGT GTTGCAACGA CAGATGCGAA GAAAGAAAAA GCATTGATTG 12780
TTGGAGAAAA GCAATATCGT GTCACAAACT ATGTAACAAG TTTGCATGAA GGCTTATTCA 12840
ATATTGCAGC AGCTGTTGGC GTATCCAGTC CTACAGAAAT TACTGCTGAT CATATTGTAT 12900
ATCGAAAAGT CGATGGTGAG TTACAAACGA TACATGATTA TAAATTAAAA CTCATTAGTT 12960
AACTTAATTA TTTCGGGAAA TTGAAAGCAG CGGATTTTAG CGTTACTGCA AATAATTTTA 13020
TATTAGTAGT GGATGCTGGT CACACAAGAA CTTCAAATAT TAAAGCCCTC AGAATATGAA 13080
TTAAGGTTTG TAACCTTAGT CTTATCTGAG GGCATTTTTA AGTTATAAAC TATTTGTCGT 13140
CCATTTTATC TTTTTCTTTT AAACCTCTGT GCTTTAATTG CTTTTCAAGT TTTTCAAAAC 13200
TAATATCTTT ATTTTCTTTA GTCGAAACAC CAAGACGTTT ATTTAATTTT TTCATGTCAA 13260
CTTCTGTGTA ATCTATGTCT AAGTGyTCAA TTGCTTTTTT ATCTTTATAG TCTACTTTGT 13320
ATTTTACGCC TTTAAGGTCT TTGAAAATAC TTTCAGATTT GGCGAATAAC TTTTTGGCTT 13380
CGTCTTTATC CATACCTAGA TCGTCATATT TAATTGTGTT GATTGTAGAC TGTTTTAAAA 13440
CTTTATCATC TTTATATGTG ATAGAAGTTA GTACATGTTT ACCACTAACA TCACCwTCAT 13500
ATGTTTTGGT TTGTTCTTTA CCACAAGCTG ATAATGCAAT GATACAAACT AATGCTACTA 13560
CAATTAATGA ACATAATTTT TTCAAAGTCA GTCGCCTTCT TTCGATATTT GTATTATAAA 13620
GAAATTATAA CATTTACTAA AAAATGATGT TATTCAAAAA TTTAAATTTT GTCATTTTTl' 13680
AAGCGGATTC CTCACAAAAT TTTAAAAATA TTTAAGCCTk 13740
AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800
GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860
TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920
GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980
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